How does the new automatic reference counting mechanism work?

6

212

Can someone briefly explain to me how ARC works? I know it's different from Garbage Collection, but I was just wondering exactly how it worked.

Also, if ARC does what GC does without hindering performance, then why does Java use GC? Why doesn't it use ARC as well?

Penology answered 17/6, 2011 at 11:40 Comment(7)
This will tell you all about it: http://clang.llvm.org/docs/AutomaticReferenceCounting.html How it's implemented in Xcode and iOS 5 is under NDA.Relict
@mbehan That's poor advice. I don't want to log in or even have an account for iOS dev center, but I'm interested in knowing about ARC nevertheless.Gibeonite
ARC does not do everything that GC does, it requires you to work with strong and weak reference semantics explicitly, and leaks memory if you don't get those right. In my experience, this is at first tricky when you use blocks in Objective-C, and even after you learn of the tricks you're left with some annoying (IMO) boilerplate code around many usages of blocks. It's more convenient to just forget about strong/weak references. Moreover, GC can perform somewhat better than ARC wrt. CPU, but requires more memory. It can be faster than explicit memory management when you have a lot of memory.Toffey
@TaylanUB: "requires more memory". A lot of people say that but I find it difficult to believe.Fruitless
@mbehan: "The accepted answer here is a much better answer to the question though". The accepted answer is mostly wrong.Fruitless
@JonHarrop: Currently I don't even remember why I said that, to be honest. :-) In the meantime I realized that there are so many different GC strategies that such blanket statements are probably all worthless. Let me quote Hans Boehm from his Memory Allocation Myths and Half-Truths: "Why is this area so prone to dubious folk-wisdoms?"Toffey
Deleted a couple of comments from 2 years ago that people were giving out about but here's the link I was providing, which I still think is a very useful introduction to the subject: developer.apple.com/library/ios/releasenotes/General/…Giacobo
248

Every new developer who comes to Objective-C has to learn the rigid rules of when to retain, release, and autorelease objects. These rules even specify naming conventions that imply the retain count of objects returned from methods. Memory management in Objective-C becomes second nature once you take these rules to heart and apply them consistently, but even the most experienced Cocoa developers slip up from time to time.

With the Clang Static Analyzer, the LLVM developers realized that these rules were reliable enough that they could build a tool to point out memory leaks and overreleases within the paths that your code takes.

Automatic reference counting (ARC) is the next logical step. If the compiler can recognize where you should be retaining and releasing objects, why not have it insert that code for you? Rigid, repetitive tasks are what compilers and their brethren are great at. Humans forget things and make mistakes, but computers are much more consistent.
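To make the "insert that code for you" idea concrete, here is a minimal sketch of the bookkeeping involved. It is written in Python rather than Objective-C, and the `RefCounted` class and both functions are hypothetical names invented for illustration; this is a toy model of retain counts, not Apple's runtime or the compiler's actual output.

```python
class RefCounted:
    """Toy object with an explicit retain count (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.retain_count = 1        # the creator owns the new object
        self.deallocated = False

    def retain(self):
        self.retain_count += 1
        return self

    def release(self):
        self.retain_count -= 1
        if self.retain_count == 0:
            self.deallocated = True  # stands in for running dealloc


def use_object_manually():
    """Every retain/release is written by hand, as under manual counting."""
    obj = RefCounted("image")   # count is 1
    cache = obj.retain()        # a second owner appears, count is 2
    cache.release()             # the second owner is done, count is 1
    obj.release()               # the creator is done, count is 0
    return obj


def use_object_forgetting_release():
    """The same code with one release forgotten: the object leaks."""
    obj = RefCounted("image")   # count is 1
    cache = obj.retain()        # count is 2
    cache.release()             # count is 1
    # Missing obj.release(): the count never reaches 0.
    return obj
```

Under ARC you would write only the creation and the assignment to `cache`; the calls to `retain` and `release` are the part the compiler is assumed to add, which is why the second kind of mistake becomes much harder to make.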

However, this doesn't completely free you from worrying about memory management on these platforms. I describe the primary issue to watch out for (retain cycles) in my answer here, which may require a little thought on your part to mark weak pointers. However, that's minor when compared to what you're gaining in ARC.

When compared to manual memory management and garbage collection, ARC gives you the best of both worlds by cutting out the need to write retain / release code, yet not having the halting and sawtooth memory profiles seen in a garbage collected environment. About the only advantages garbage collection has over this are its ability to deal with retain cycles and the fact that atomic property assignments are inexpensive (as discussed here). I know I'm replacing all of my existing Mac GC code with ARC implementations.

As to whether this could be extended to other languages, it seems geared around the reference counting system in Objective-C. It might be difficult to apply this to Java or other languages, but I don't know enough about the low-level compiler details to make a definitive statement there. Given that Apple is the one pushing this effort in LLVM, Objective-C will come first unless another party commits significant resources of their own to this.

The unveiling of this shocked developers at WWDC, so people weren't aware that something like this could be done. It may appear on other platforms over time, but for now it's exclusive to LLVM and Objective-C.

Duax answered 20/6, 2011 at 22:43 Comment(14)
emphasis mine: this doesn't completely free you from worrying about memory managementUttasta
Thanks for the reply. Can you give a quick example of a case where you would need manual mem management? ThanksPenology
@Penology - For the memory management conditions that you'll still need to be aware of while under ARC, see my answer that I link to above: #6260756Duax
Is ARC really an innovation? From your answer I conclude that ARC is a new concept, which is used in Objective-C for the first time (correct me if I'm wrong). To be honest, I'm not an Objective-C developer and don't know much about ARC, but are Boost Shared Pointers (see boost.org) not exactly the same thing? And if they aren't, what is the difference?Bodice
@DMM - Rather than relying on overloaded operators (as Boost does), this is a compiler-level process, which extends it across the whole language. Among other things, this makes it easy to convert a manually reference counted application to ARC. Boost might also handle local variables differently than ARC does, where ARC knows the moment that a local variable is no longer being used and can release at that point. I believe that with Boost you still need to specify in some way that you are done with the variable.Duax
@Brad - I see, thanks for the clarification. Switching a C++ application from raw to boost shared pointers is indeed not trivial, not to say impossible. But the last point you mention is not true: You don't need to explicitly release a boost pointer. The pointer itself resides on the stack (or is an embedded member), so it will be deleted on scope exit. Then, the destructor runs and performs the smart pointer trickery.Bodice
To answer the "is it new" question, Delphi has had automatic reference counting for strings, arrays and interfaces (for COM support) for well over a decade. I agree that it really is a nice compromise between a gc'd environment and a "do it all manually" environment. I'm glad it's in ObjC and LLVM (so other languages can take advantage of it as well).Hsining
Whether Apple's ARC is actually innovation or not depends on what else you could consider ARC. For example, AnsiStrings are Reference Counted in Delphi and FreePascal - Automatically. ARC is more generic, and more limited as well. It was also more needed in ObjectiveC than in some other languages.Immovable
Thank you for explaining ARC in such clear terms. I'd love to have much more of Objective-C/Xcode concepts explained so clearly.Pumpkinseed
@BradLarson: "I believe that with Boost you still need to specify in some way that you are done with the variable". No, when the local variable falls out of scope the smart pointer's destructor decrements the reference count of the object it was pointing to and, if that count has reached zero, the object itself is destructed.Fruitless
@theDmi: "Is ARC really an innovation?". Automatic reference counting was invented in 1960 and has been used in many languages such as Python and Mathematica. It is not used in the JVM or CLR because it is very slow and leaks cycles.Fruitless
@BradLarson: "About the only advantages garbage collection has over this are its ability to deal with retain cycles and the fact that atomic property assignments are inexpensive". And tracing garbage collectors are about 10x faster than scope-based reference counting. flyingfrogblog.blogspot.co.uk/2011/01/…Fruitless
How does ARC cope with memory fragmentation?Files
Clearly the reference count must be updated using an atomic operation; otherwise it could be damaged if accessed by multiple threads on a multiple core system. This generates lots of bus traffic between cores, hence can be very slow. Garbage collected environment can therefore be a lot faster for multi threaded software running on many cores.Files
27

ARC is just plain old retain/release (MRC) with the compiler figuring out when to call retain/release. It will tend to have higher performance, lower peak memory use, and more predictable performance than a GC system.

On the other hand some types of data structure are not possible with ARC (or MRC), while GC can handle them.

As an example, suppose you have a class named node, and each node has an NSArray of children and a single reference to its parent. That "just works" with GC. With ARC (and manual reference counting as well) you have a problem: any given node will be referenced from its children and also from its parent.

Like:

A -> [B1, B2, B3]
B1 -> A, B2 -> A, B3 -> A

All is fine while you are using A (say via a local variable).

When you are done with it (and B1/B2/B3), a GC system will eventually decide to look at everything it can find starting from the stack and CPU registers. It will never find A,B1,B2,B3 so it will finalize them and recycle the memory into other objects.

When you use ARC or MRC and finish with A, it has a refcount of 3 (B1, B2, and B3 all reference it), and B1/B2/B3 will each have a reference count of 1 (A's NSArray holds one reference to each). So all of those objects remain live even though nothing can ever use them.
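The A/B1/B2/B3 situation can be reproduced in a small sketch. This one uses Python, on the assumption that it runs on CPython, which combines reference counting with a separate tracing cycle collector; switching the collector off approximates a pure reference-counting system. The `Node` class and the function names are hypothetical.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None     # strong reference back to the parent
        self.children = []     # strong references to the children


def build_tree():
    a = Node("A")
    for name in ("B1", "B2", "B3"):
        child = Node(name)
        child.parent = a
        a.children.append(child)
    return a


def cycle_demo():
    was_enabled = gc.isenabled()
    gc.disable()                      # leave only reference counting active
    try:
        a = build_tree()
        probe = weakref.ref(a)        # observes A without keeping it alive
        del a                         # drop the last outside reference
        alive_after_refcounting = probe() is not None
        gc.collect()                  # run the tracing cycle collector by hand
        alive_after_tracing = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_after_refcounting, alive_after_tracing
```

With counting alone, A is still alive after the last outside reference is dropped, because B1, B2 and B3 keep its count at 3. Once the tracing pass runs, the whole group is found to be unreachable and is reclaimed.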

The common solution is to decide one of those references needs to be weak (not contribute to the reference count). That will work for some usage patterns, for example if you reference B1/B2/B3 only via A. However in other patterns it fails. For example if you will sometimes hold onto B1, and expect to climb back up via the parent pointer and find A. With a weak reference if you only hold onto B1, A can (and normally will) evaporate, and take B2, and B3 with it.
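The failure mode with a weak parent pointer can be sketched as follows. As before this is Python standing in for Objective-C, it assumes CPython's immediate reference counting, and `Node` and `keep_only_first_child` are hypothetical names.

```python
import weakref


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []     # strong references to the children
        # Weak back-pointer: it does not add to the parent's reference count.
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Returns None once the parent has been deallocated.
        return self._parent() if self._parent is not None else None


def keep_only_first_child():
    a = Node("A")
    b1 = Node("B1", a)
    b2 = Node("B2", a)
    b3 = Node("B3", a)
    parent_seen_while_alive = b1.parent is a
    sibling_probe = weakref.ref(b2)   # observes B2 without keeping it alive
    del a, b2, b3                     # hold onto B1 only
    return b1, sibling_probe, parent_seen_while_alive
```

While A is referenced from outside, climbing from B1 to its parent works. After only B1 is kept, `b1.parent` yields nothing and B2 is gone as well, because nothing strong was holding A and A was the only owner of its other children.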

Sometimes this isn't an issue, but some very useful and natural ways of working with complex structures of data are very difficult to use with ARC/MRC.

So ARC targets the same sort of problems GC targets. However, ARC works on a more limited set of usage patterns than GC, so if you took a GC language (like Java) and grafted something like ARC onto it, some programs wouldn't work any more (or at least would generate tons of abandoned memory, and might cause serious swapping issues or run out of memory or swap space).

You can also say ARC puts a bigger priority on performance (or maybe predictability) while GC puts a bigger priority on being a generic solution. As a result, GC has less predictable CPU/memory demands and (normally) lower performance than ARC, but can handle any usage pattern. ARC will work much better for many common usage patterns, but for a few (valid!) usage patterns it will fall over and die.

Stereotomy answered 22/11, 2011 at 8:15 Comment(10)
"On the other hand some types of data structure are not possible with ARC" I think you meant automatic cleanup is not possible without hints; obviously, the data structures are.Bermudez
Sure, but ONLY automatic cleanup of ObjC objects is available under ARC so "no automatic cleanup" == "no cleanup". I'll reword the answer when I have more time though.Stereotomy
@Stripes: the equivalent of manual cleanup in ARC is manual breaking of cycles, eg foo = nil.Franckot
"[ARC] will tend to have higher performance...ARC puts a bigger priority on performance". I'm surprised to read that when it is well known that reference counting is much slower than tracing garbage collection. flyingfrogblog.blogspot.co.uk/2011/01/…Fruitless
In theory GC is faster (each reference count manipulation has to be multiprocessor cache coherent, and there are a lot of them). In practice the only available GC system for ObjC is much slower. It is also extremely common for GC systems to pause threads at random times for user perceptible amounts of time (there are some realtime GC systems, but they are not common, and I think they have "interesting" constraints)Stereotomy
@Stripes: You don't need a real-time GC to avoid pauses. You only need an incremental GC which is 1970s technology.Fruitless
It might be 1970s technology, but in the 2010s companies still ship platforms that either don't use incremental GC, or still suffer from significant pauses. Apple shipped a GC system a mere 10 or so years ago that suffered from this, and still does (although they now tell you not to use it). Google has done the same (except they don't tell you not to use it, and you can't avoid it on Android...unless you bypass Java). Both companies had significant reasons to do it right if possible. Maybe it is harder than you think to satisfy all the constraints in a real world system?Stereotomy
Can you extend your answer by providing a concrete example (with code) where weak references fail to address retain cycles? What you wrote is unclear to me ("For example if you will sometimes hold onto B1, and expect to climb back up via the parent pointer and find A. With a weak reference if you only hold onto B1, A can (and normally will) evaporate, and take B2, and B3 with it.")Cupric
Exactly, with a weak reference holding a leaf node will not keep parents alive. With strong references all nodes will always be kept alive. With garbage collection holding a leaf is sufficient to keep the entire graph alive AND if you let go of the leaf GC will notice that the nodes only reference each other and release it all.Stereotomy
The weak references prevent retain cycles, but they don't let you have (for example) an object graph with parent & child pointers that keeps the whole graph alive until you have no external references.Stereotomy
5

Magic

But more specifically, ARC works by doing exactly what you would do with your code (with certain minor differences). ARC is a compile time technology, unlike GC which is runtime and will impact your performance negatively. ARC will track the references to objects for you and synthesize the retain/release/autorelease calls according to the normal rules. Because of this, ARC can also release things as soon as they are no longer needed, rather than throwing them into an autorelease pool purely for convention's sake.

Some other improvements include zeroing weak references, automatic copying of blocks to the heap, and speedups across the board (6x for autorelease pools!).
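As a rough illustration of what "zeroing" means for a weak reference, here is a small Python sketch. It relies on the standard `weakref` module and assumes CPython, where an object is freed as soon as its last strong reference goes away; the `Image` class and the function name are hypothetical, and this is an analogy rather than how the Objective-C runtime is implemented.

```python
import weakref


class Image:
    def __init__(self, name):
        self.name = name


def zeroing_weak_reference_demo():
    image = Image("photo")             # the only strong reference
    weak_image = weakref.ref(image)    # weak: does not keep the object alive
    seen_while_alive = weak_image().name
    del image                          # the last strong reference goes away
    seen_after_release = weak_image()  # the weak reference now yields None
    return seen_while_alive, seen_after_release
```

The point is that the weak reference turns into a null value once the object is gone, instead of being left pointing at freed memory.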

More detailed discussion about how all this works is found in the LLVM Docs on ARC.

Direction answered 17/6, 2011 at 13:8 Comment(3)
-1 "ARC is a compile time technology, unlike GC which is runtime and will impact your performance negatively". Reference counts are bumped at run-time which is very inefficient. That's why tracing GCs like the JVM and .NET are so much faster.Fruitless
@Jon: Do you have a proof of this ? From my own reading, it seems that new RC algorithms typically perform as well or better than M&S GC.Chantress
@xryl669: There is a full explanation in the GC Handbook (gchandbook.org). Note that tracing != M&S.Fruitless
3

It differs greatly from garbage collection. Have you seen the warnings that tell you that you may be leaking objects on different lines? Those warnings even tell you on what line you allocated the object. This has been taken a step further: the compiler can now insert retain/release statements at the proper locations, better than most programmers, almost 100% of the time. Occasionally there are some weird instances of retained objects that you need to help it out with.

Electrostriction answered 17/6, 2011 at 12:49 Comment(0)
0

This is explained very well in the Apple developer documentation. Read "How ARC Works":

To make sure that instances don’t disappear while they are still needed, ARC tracks how many properties, constants, and variables are currently referring to each class instance. ARC will not deallocate an instance as long as at least one active reference to that instance still exists.

To learn the difference between garbage collection and ARC, read this.

Eldoneldora answered 31/3, 2017 at 10:27 Comment(0)
-1

ARC is a compiler feature that provides automatic memory management of objects.

Instead of you having to remember when to use retain, release, and autorelease, ARC evaluates the lifetime requirements of your objects and automatically inserts appropriate memory management calls for you at compile time. The compiler also generates appropriate dealloc methods for you.

The compiler inserts the necessary retain/release calls at compile time, but those calls are executed at runtime, just like any other code.

The following diagram gives a better understanding of how ARC works.

[Image: diagram of how ARC works]

If you are new to iOS development and have no experience with Objective-C, please refer to Apple's Advanced Memory Management Programming Guide for a better understanding of memory management.

Between answered 17/6, 2011 at 11:40 Comment(0)
