I have a problem with the embedded bitcode term.
What is embedded bitcode?
When to enable, ENABLE_BITCODE
in new Xcode?
What happens to the binary when enabled, ENABLE_BITCODE
in Xcode 7?
Bitcode refers to to the type of code: "LLVM Bitcode" that is sent to iTunes Connect. This allows Apple to use certain calculations to re-optimize apps further (e.g: possibly downsize executable sizes). If Apple needs to alter your executable then they can do this without a new build being uploaded.
This differs from: Slicing which is the process of Apple optimizing your app for a user's device based on the device's resolution and architecture. Slicing does not require Bitcode. (Ex: only including @2x images on a 5s)
App Thinning is the combination of slicing, bitcode, and on-demand resources
Bitcode is an intermediate representation of a compiled program. Apps you upload to iTunes Connect that contain bitcode will be compiled and linked on the App Store. Including bitcode will allow Apple to re-optimize your app binary in the future without the need to submit a new version of your app to the store.
if true { // code to be run }
in your app then they optimize it by replacing it with // code to be run
). Originally with the app's compressed executable this wouldn't be possible. But now since we give them an intermediate version they can rebuild the executable with the optimization in place. –
Codpiece What is embedded bitcode?
According to docs:
Bitcode is an intermediate representation of a compiled program. Apps you upload to iTunes Connect that contain bitcode will be compiled and linked on the App Store. Including bitcode will allow Apple to re-optimize your app binary in the future without the need to submit a new version of your app to the store.
Update: This phrase in "New Features in Xcode 7" made me to think for a long time that Bitcode is needed for Slicing to reduce app size:
When you archive for submission to the App Store, Xcode will compile your app into an intermediate representation. The App Store will then compile the bitcode down into the 64 or 32 bit executables as necessary.
However that's not true, Bitcode and Slicing work independently: Slicing is about reducing app size and generating app bundle variants, and Bitcode is about certain binary optimizations. I've verified this by checking included architectures in executables of non-bitcode apps and founding that they only include necessary ones.
Bitcode allows other App Thinning component called Slicing to generate app bundle variants with particular executables for particular architectures, e.g. iPhone 5S variant will include only arm64 executable, iPad Mini armv7 and so on.
When to enable ENABLE_BITCODE in new Xcode?
For iOS apps, bitcode is the default, but optional. If you provide bitcode, all apps and frameworks in the app bundle need to include bitcode. For watchOS and tvOS apps, bitcode is required.
What happens to the binary when ENABLE_BITCODE is enabled in the new Xcode?
From Xcode 7 reference:
Activating this setting indicates that the target or project should generate bitcode during compilation for platforms and architectures which support it. For Archive builds, bitcode will be generated in the linked binary for submission to the app store. For other builds, the compiler and linker will check whether the code complies with the requirements for bitcode generation, but will not generate actual bitcode.
Here's a couple of links that will help in deeper understanding of Bitcode:
Since the exact question is "what does enable bitcode do", I'd like to give a few thin technical details I've figured out thus far. Most of this is practically impossible to figure out with 100% certainty until Apple releases the source code for this compiler
First, Apple's bitcode does not appear to be the same thing as LLVM bytecode. At least, I've not been able to figure out any resemblance between them. It appears to have a proprietary header (always starts with "xar!") and probably some link-time reference magic that prevents data duplications. If you write out a hardcoded string, this string will only be put into the data once, rather than twice as would be expected if it was normal LLVM bytecode.
Second, bitcode is not really shipped in the binary archive as a separate architecture as might be expected. It is not shipped in the same way as say x86 and ARM are put into one binary (FAT archive). Instead, they use a special section in the architecture specific MachO binary named "__LLVM" which is shipped with every architecture supported (ie, duplicated). I assume this is a short coming with their compiler system and may be fixed in the future to avoid the duplication.
C code (compiled with clang -fembed-bitcode hi.c -S -emit-llvm
):
#include <stdio.h>
int main() {
printf("hi there!");
return 0;
}
LLVM IR output:
; ModuleID = '/var/folders/rd/sv6v2_f50nzbrn4f64gnd4gh0000gq/T/hi-a8c16c.bc'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.10.0"
@.str = private unnamed_addr constant [10 x i8] c"hi there!\00", align 1
@llvm.embedded.module = appending constant [1600 x i8] c"\DE\C0\17\0B\00\00\00\00\14\00\00\00$\06\00\00\07\00\00\01BC\C0\DE!\0C\00\00\86\01\00\00\0B\82 \00\02\00\00\00\12\00\00\00\07\81#\91A\C8\04I\06\1029\92\01\84\0C%\05\08\19\1E\04\8Bb\80\10E\02B\92\0BB\84\102\148\08\18I\0A2D$H\0A\90!#\C4R\80\0C\19!r$\07\C8\08\11b\A8\A0\A8@\C6\F0\01\00\00\00Q\18\00\00\C7\00\00\00\1Bp$\F8\FF\FF\FF\FF\01\90\00\0D\08\03\82\1D\CAa\1E\E6\A1\0D\E0A\1E\CAa\1C\D2a\1E\CA\A1\0D\CC\01\1E\DA!\1C\C8\010\87p`\87y(\07\80p\87wh\03s\90\87ph\87rh\03xx\87tp\07z(\07yh\83r`\87th\07\80\1E\E4\A1\1E\CA\01\18\DC\E1\1D\DA\C0\1C\E4!\1C\DA\A1\1C\DA\00\1E\DE!\1D\DC\81\1E\CAA\1E\DA\A0\1C\D8!\1D\DA\A1\0D\DC\E1\1D\DC\A1\0D\D8\A1\1C\C2\C1\1C\00\C2\1D\DE\A1\0D\D2\C1\1D\CCa\1E\DA\C0\1C\E0\A1\0D\DA!\1C\E8\01\1D\00s\08\07v\98\87r\00\08wx\876p\87pp\87yh\03s\80\876h\87p\A0\07t\00\CC!\1C\D8a\1E\CA\01 \E6\81\1E\C2a\1C\D6\A1\0D\E0A\1E\DE\81\1E\CAa\1C\E8\E1\1D\E4\A1\0D\C4\A1\1E\CC\C1\1C\CAA\1E\DA`\1E\D2A\1F\CA\01\C0\03\80\A0\87p\90\87s(\07zh\83q\80\87z\00\C6\E1\1D\E4\A1\1C\E4\00 \E8!\1C\E4\E1\1C\CA\81\1E\DA\C0\1C\CA!\1C\E8\A1\1E\E4\A1\1C\E6\01X\83y\98\87y(\879`\835\18\07|\88\03;`\835\98\87y(\076X\83y\98\87r\90\036X\83y\98\87r\98\03\80\A8\07w\98\87p0\87rh\03s\80\876h\87p\A0\07t\00\CC!\1C\D8a\1E\CA\01 \EAa\1E\CA\A1\0D\E6\E1\1D\CC\81\1E\DA\C0\1C\D8\E1\1D\C2\81\1E\00s\08\07v\98\87r\006\C8\88\F0\FF\FF\FF\FF\03\C1\0E\E50\0F\F3\D0\06\F0 \0F\E50\0E\E90\0F\E5\D0\06\E6\00\0F\ED\10\0E\E4\00\98C8\B0\C3<\94\03@\B8\C3;\B4\819\C8C8\B4C9\B4\01<\BCC:\B8\03=\94\83<\B4A9\B0C:\B4\03@\0F\F2P\0F\E5\00\0C\EE\F0\0Em`\0E\F2\10\0E\EDP\0Em\00\0F\EF\90\0E\EE@\0F\E5 \0FmP\0E\EC\90\0E\ED\D0\06\EE\F0\0E\EE\D0\06\ECP\0E\E1`\0E\00\E1\0E\EF\D0\06\E9\E0\0E\E60\0Fm`\0E\F0\D0\06\ED\10\0E\F4\80\0E\809\84\03;\CCC9\00\84;\BCC\1B\B8C8\B8\C3<\B4\819\C0C\1B\B4C8\D0\03:\00\E6\10\0E\EC0\0F\E5\00\10\F3@\0F\E10\0E\EB\D0\06\F0 \0F\EF@\0F\E50\0E\F4\F0\0E\F2\D0\06\E2P\0F\E6`\0E\E5 \0Fm0\0F\E9\A0\0F\E5\00\E0\01@\D0C8\C8\C39\94\03=\B4\C18\C0C=\00\E3\F0\0E\F2P\0Er\00\10\F4\10\0E\F2p\0E\E5@\0Fm`\0E\E5\10\0E\F4P\0F\F2P\0E\F3\00\AC\C1<\CC\C3<\94\C3\1C\B0\C1\1A\8C\03>\C4\81\1D\B0\C1\1A\CC\C3<\94\03\1B\AC\C1<\CCC9\C8\01\1B\AC\C1<\CCC9\CC\01@\D4\83;\CCC8\98C9\B4\819\C0C\1B\B4C8\D0\03:\00\E6\10\0E\EC0\0F\E5\00\10\F50\0F\E5\D0\06\F3\F0\0E\E6@\0Fm`\0E\EC\F0\0E\E1@\0F\809\84\03;\CCC9\00\00I\18\00\00\02\00\00\00\13\82`B \00\00\00\89 \00\00\0D\00\00\002\22\08\09 d\85\04\13\22\A4\84\04\13\22\E3\84\A1\90\14\12L\88\8C\0B\84\84L\100s\04H*\00\C5\1C\01\18\94`\88\08\AA0F7\10@3\02\00\134|\C0\03;\F8\05;\A0\836\08\07x\80\07v(\876h\87p\18\87w\98\07|\88\038p\838\80\037\80\83\0DeP\0Em\D0\0Ez\F0\0Em\90\0Ev@\07z`\07t\D0\06\E6\80\07p\A0\07q \07x\D0\06\EE\80\07z\10\07v\A0\07s \07z`\07t\D0\06\B3\10\07r\80\07:\0FDH #EB\80\1D\8C\10\18I\00\00@\00\00\C0\10\A7\00\00 \00\00\00\00\00\00\00\868\08\10\00\02\00\00\00\00\00\00\90\05\02\00\00\08\00\00\002\1E\98\0C\19\11L\90\8C\09&G\C6\04C\9A\22(\01\0AM\D0i\10\1D]\96\97C\00\00\00y\18\00\00\1C\00\00\00\1A\03L\90F\02\134A\18\08&PIC Level\13\84a\D80\04\C2\C05\08\82\83c+\03ab\B2j\02\B1+\93\9BK{s\03\B9q\81q\81\01A\19c\0Bs;k\B9\81\81q\81q\A9\99q\99I\D9\10\14\8D\D8\D8\EC\DA\5C\DA\DE\C8\EA\D8\CA\5C\CC\D8\C2\CE\E6\A6\04C\1566\BB6\974\B227\BA)A\01\00y\18\00\002\00\00\003\08\80\1C\C4\E1\1Cf\14\01=\88C8\84\C3\8CB\80\07yx\07s\98q\0C\E6\00\0F\ED\10\0E\F4\80\0E3\0CB\1E\C2\C1\1D\CE\A1\1Cf0\05=\88C8\84\83\1B\CC\03=\C8C=\8C\03=\CCx\8Ctp\07{\08\07yH\87pp\07zp\03vx\87p \87\19\CC\11\0E\EC\90\0E\E10\0Fn0\0F\E3\F0\0E\F0P\0E3\10\C4\1D\DE!\1C\D8!\1D\C2a\1Ef0\89;\BC\83;\D0C9\B4\03<\BC\83<\84\03;\CC\F0\14v`\07{h\077h\87rh\077\80\87p\90\87p`\07v(\07v\F8\05vx\87w\80\87_\08\87q\18\87r\98\87y\98\81,\EE\F0\0E\EE\E0\0E\F5\C0\0E\EC\00q \00\00\05\00\00\00&`<\11\D2L\85\05\10\0C\804\06@\F8\D2\14\01\00\00a \00\00\0B\00\00\00\13\04A,\10\00\00\00\03\00\00\004#\00dC\19\020\18\83\01\003\11\CA@\0C\83\11\C1\00\00#\06\04\00\1CB\12\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", section "__LLVM,__bitcode"
@llvm.cmdline = appending constant [67 x i8] c"-triple\00x86_64-apple-macosx10.10.0\00-emit-llvm\00-disable-llvm-optzns\00", section "__LLVM,__cmdline"
; Function Attrs: nounwind ssp uwtable
define i32 @main() #0 {
%1 = alloca i32, align 4
store i32 0, i32* %1
%2 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([10 x i8]* @.str, i32 0, i32 0))
ret i32 0
}
declare i32 @printf(i8*, ...) #1
attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="core2" "target-features"="+ssse3,+cx16,+sse,+sse2,+sse3" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="core2" "target-features"="+ssse3,+cx16,+sse,+sse2,+sse3" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"Apple LLVM version 7.0.0 (clang-700.0.53.3)"}
The data array that is in the IR also changes depending on the optimization and other code generation settings of clang. It's completely unknown to me what format or anything that this is in.
EDIT:
Following the hint on Twitter, I decided to revisit this and to confirm it. I followed this blog post and used his bitcode extractor tool to get the Apple Archive binary out of the MachO executable. And after extracting the Apple Archive with the xar utility, I got this (converted to text with llvm-dis of course)
; ModuleID = '1'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.10.0"
@.str = private unnamed_addr constant [10 x i8] c"hi there!\00", align 1
; Function Attrs: nounwind ssp uwtable
define i32 @main() #0 {
%1 = alloca i32, align 4
store i32 0, i32* %1
%2 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([10 x i8], [10 x i8]* @.str, i32 0, i32 0))
ret i32 0
}
declare i32 @printf(i8*, ...) #1
attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="core2" "target-features"="+ssse3,+cx16,+sse,+sse2,+sse3" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="core2" "target-features"="+ssse3,+cx16,+sse,+sse2,+sse3" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"Apple LLVM version 7.0.0 (clang-700.1.76)"}
The only notable difference really between the non-bitcode IR and the bitcode IR is that filenames have been stripped to just 1, 2, etc for each architecture.
I also confirmed that the bitcode embedded in a binary is generated after optimizations. If you compile with -O3 and extract out the bitcode, it'll be different than if you compile with -O0.
And just to get extra credit, I also confirmed that Apple does not ship bitcode to devices when you download an iOS 9 app. They include a number of other strange sections that I don't recognized like __LINKEDIT, but they do not include __LLVM.__bundle, and thus do not appear to include bitcode in the final binary that runs on a device. Oddly enough, Apple still ships fat binaries with separate 32/64bit code to iOS 8 devices though.
xar!
is Apple's archive file format. –
Kettering Bitcode (iOS, watchOS)
Bitcode is an intermediate representation of a compiled program. Apps you upload to iTunes Connect that contain bitcode will be compiled and linked on the App Store. Including bitcode will allow Apple to re-optimize your app binary in the future without the need to submit a new version of your app to the store.
Basically this concept is somewhat similar to java where byte code is run on different JVM's and in this case the bitcode is placed on iTune store and instead of giving the intermediate code to different platforms(devices) it provides the compiled code which don't need any virtual machine to run.
Thus we need to create the bitcode once and it will be available for existing or coming devices. It's the Apple's headache to compile an make it compatible with each platform they have.
Devs don't have to make changes and submit the app again to support new platforms.
Let's take the example of iPhone 5s when apple introduced x64
chip in it. Although x86
apps were totally compatible with x64
architecture but to fully utilise the x64
platform the developer has to change the architecture or some code. Once s/he's done the app is submitted to the app store for the review.
If this bitcode concept was launched earlier then we the developers doesn't have to make any changes to support the x64
bit architecture.
Update
Apple has clarified that slicing occurs independent of enabling bitcode. I've observed this in practice as well where a non-bitcode enabled app will only be downloaded as the architecture appropriate for the target device.
Original
Bitcode. Archive your app for submission to the App Store in an intermediate representation, which is compiled into 64- or 32-bit executables for the target devices when delivered.
Slicing. Artwork incorporated into the Asset Catalog and tagged for a platform allows the App Store to deliver only what is needed for installation.
The way I read this, if you support bitcode, downloaders of your app will only get the compiled architecture needed for their own device.
Bitcode official page
[Intermediate Representation (IR) in LLVM] has three forms:
- human readable assembly language representation
- in-memory compiler IR
Bitcode
(on-disk bitcode representation, bitcode file format, binary format).
It is bitstream(binary encoding) file format for LLVM IR. It is a result of LLVM IR serialization. It can be optionally embedded into Wrapper or Native Object File(Mach-O
inside Raw segment data[About]). It is suitable for Just-In-Time compiler.
You are able to convert bitcode
IR into human readable IR using llvm-dis
Another advantage which Apple uses is a possibility of recompiling binary for another(new) architecture(instruction set architecture (ISA)
) without developer attention. Also as a small additional you have a possibility to reverse engineering, which allows Apple to analize binary easier, but on the other hand it is a disadvantage which can be used by malefactor. Also it increase build time
When you build bitcode .BCSymbolMap
[About] also is generated for analizing error stack traces
Please note that bitcode is not generated for simulator(arch x86_64). Xcode uses bitcode in next scenarios:
Flags:
-fembed-bitcode
- embed bitcode-fembed-bitcode-marker
- just mark where it will be located.__LLVM
segment is empty, without any data
Using:
Enable Bitcode
(ENABLE_BITCODE
). YES - Is default for App, framework targets- uses
-fembed-bitcode-marker
for regular build - uses
-fembed-bitcode
embeds bitcode in archive(Product -> Archive) or (xcodebuild archive)
- uses
Add flag explicitly to
Other C Flags
(OTHER_CFLAGS
)User-Defined Setting
BITCODE_GENERATION_MODE
marker
- adds-fembed-bitcode-marker
bitcode
- adds-fembed-bitcode
xcodebuild
with appropriate options above
//please make sure that this settings is placed before xcodebuild params(.e.g. -workspace, -scheme...)
xcodebuild ENABLE_BITCODE=YES
//or
xcodebuild BITCODE_GENERATION_MODE="bitcode"
//or
xcodebuild OTHER_CFLAGS="-fembed-bitcode"
If you use embed bitcode
in app but not all libraries support it you get
ld: bitcode bundle could not be generated because '<path>' was built without full bitcode. All frameworks and dylibs for bitcode must be generated from Xcode Archive or Install build file '<path>' for architecture <arch>
Check if binary contains bitcode
The bitcode must be stored in a section of the object file named __LLVM,__bitcode for MachO and .llvmbc for the other object formats.
Bitcode injects into __LLVM
segment three sections: __bitcode
, __cmdline
, __asm
. Apple's version of LLVM uses a little bit different logic and moves __bitcode
and __cmdline
into __bundle
section as .xar
archive.
eXtensible ARchive(XAR)
- .xar, .pkg archiver's file format which consists of header, table of contents(toc), heap. TOC is for random access to archived files. Every file in xar is independently compressed
otool -l
and find __LLVM __bundle.
You can check segment name and section name in Mach-O file
But it does not guarantee that bitcode is included(e.g. marker)
//<segname> <sectname> e.g. __LLVM __bundle. They are started from __
otool -l "/Users/alex/MyModule.framework/MyModule"
//or universal framework(specify arch)
otool -arch arm64 -l "/Users/alex/MyModule.framework/MyModule"
//or all arch
otool -arch all -l "/Users/alex/MyModule.framework/MyModule"
//-l print the load commands
output:
Section
sectname __bundle
segname __LLVM
addr 0x00000000000c0000
size 0x00000000003af3ce
offset 770048
...
otool -v -s __LLVM __bundle
otool -v -s __LLVM __bundle <binary_path>
//e.g.
otool -v -s __LLVM __bundle "/Users/alex/MyModule.framework/MyModule"
// -s <segname> <sectname> print contents of section. e.g. -s __LLVM __bundle
// -v print verbosely (symbolically) when possible
output for otool -s __LLVM __bundle. It is bitstream(binary encoding)
Contents of (__LLVM,__bundle) section
00000000000b4000 21726178 01001c00 00000000 c60d0000
00000000000b4010 00000000 be860000 01000000 9decda78
00000000000b4020 b6dc735b f3dfc715 5f7a3429 bdc1ce2f
output for otool -v -s __LLVM __bundle. It is XAR's table of content(TOC). -v
Converts bitstream(binary encoding) to XML format of XAR's table of content(TOC)
For (__LLVM,__bundle) section: xar table of contents:
<?xml version="1.0" encoding="UTF-8"?>
<xar>
<subdoc subdoc_name="Ld">
<version>1.0</version>
...
- One more indicator is generated
.bcsymbolmap
[About]
Find and extract bitcode
Closed source Library developer - XCFramework
App developer - enable bitcode
Is bitcode mandatory Official
For iOS apps, bitcode is the default, but optional. For watchOS and tvOS apps, bitcode is required.
Binary size
Bitcode increases binary size, when it is not mandatory you can remove bitcode manually from binary using bitcode_strip
For example
xcrun bitcode_strip -r "/Users/alex/MyModule.framework/MyModule" -o "/Users/alex/MyModule.framework/MyModule"
// -r remove bitcode
// -o output file name
© 2022 - 2024 — McMap. All rights reserved.