Understanding a Large, Undocumented Set of Source Code? [closed]

11

17

I have always been astonished by Wine. Sometimes I want to hack on it, fix little things and generally understand how it works. So, I download the Wine source code and right after that I feel overwhelmed. The codebase is huge and - unlike the Linux Kernel - there are almost no guides about the code.

What are the best practices for understanding such a huge codebase?

Outbreak answered 26/3, 2009 at 1:53 Comment(0)
19

With a complex code base the biggest mistake you can make is trying to be a computer. Get the computer to run the code, and use a debugger to help find out what is going on.

  1. Figure out how to compile, install and run your own version of Wine from the existing source code.

  2. Learn how to debug (e.g. use gdb) on a running instance of your version of Wine.

  3. Run Wine under the debugger and make it demonstrate the undesired behaviour.

  4. The fun part: find where the code execution path goes and start learning how it all goes together.

Yes, reading lots and lots of code will help, but the compiler/debugger/computer can run code a lot faster than you.

Geographer answered 26/3, 2009 at 2:28 Comment(0)
11

A professor once told us to compare such a situation with climbing a mountain. You might be listening to someone who did this and tells you what it's like to look out into the country. And you believe without hesitation that that's a spectacular sight.

However, you have to start climbing yourself to really understand what the view from the top is like.

And it's not that important to climb all the way to the top. It might be perfectly sufficient just to reach a fair height above ground level.

But don't ever be afraid to start climbing. The view is always worth the effort.


This has always been a nice analogy for me. I know this question was more about specific tips on how to efficiently deal with code bases once you started climbing. But nevertheless it instantly reminded me of our physics classes way back then.

Greedy answered 26/3, 2009 at 5:33 Comment(0)
8

(This is an answer I posted to a question a while back. I modified it a bit to fit this question.)

Experience has shown me that there are 3 major goals you have when learning a legacy system:

  1. Learn what the code is supposed to do.
  2. Learn how it does it.
  3. (crucially) Learn why it does it the way it does.

All three of those parts are very important, and there are a few tricks to help you get started.

First, resist the temptation to just ctrl-click (or whatever your IDE uses) your way around the code to understand everything. You probably won't be able to keep everything in perspective this way, especially when each line forces you to look at several other classes to understand it, because you end up having to hold several levels of the call stack in your head.

Read documentation where possible; it usually helps you quickly gain a mental framework upon which to build everything that follows.

Run test cases where possible.

Don't be afraid to ask someone who knows if you have a question. Granted, you shouldn't waste others' time with inane queries, but if there's something that you simply don't understand (this is especially true with more conceptual questions like, "Wouldn't it make much more sense to implement this as a ___" or something), it's probably worth finding out the answer before you mess something up and don't know why.

When you do finally get down to reading the code, start at a logical "main" place and go from there. Don't just read the code top to bottom, or in alphabetical order, or anything (this is probably obvious).

Playhouse answered 26/3, 2009 at 2:38 Comment(2)
Understanding the "whys" is the most challenging - at least for me. - Outbreak
In a program like WINE, the "why" part is usually going to be "because that's what Windows does", right? - Vermont
6

The best way to get acquainted with a large codebase is to dive in. Many projects have a list of easy tasks that need to be done, and they're usually reserved to help ease people in. You should find and work on some of these; you'll learn a lot about the general code outline and structure, contribute to the project, and get an easy payoff that will help encourage you to take on larger tasks.

Like most projects, WINE has good resources available to its developers: IRC, wiki, mailing list, and guides/overviews. With most daunting codebases, it's not so scary after the first few fixes. WINE is truly large, and much like the kernel, I doubt there's any expert in all systems; don't feel like you need to be one either. Start working on something that matters to you and take it from there.

I've started a few patches to WINE myself, and it's a good community and good structure. There are lots of very helpful debug messages, and it's a really cool project to work on, so that helps you stick with it longer too.

We all appreciate your valor and willingness to help with WINE (it needs it). Thanks, and good luck.

Hallagan answered 26/3, 2009 at 2:5 Comment(0)
4

Dig in. Think of a question you'd like to have answered, and try to find the answer. When you get tired of reading code, go read the dev mailing list, the developer's guide, or the wiki.

Unfortunately, there's no royal road to understanding a large code base. If you enjoy that sort of thing (I do) you're in for some fun. If not, guide books won't really help, so you aren't really that much worse off.

Background answered 26/3, 2009 at 1:56 Comment(0)
4

Look for one particular feature you are interested in improving. Search for its implementation. Once you have found it, pull on that straw and all the rest will follow.

Backstay answered 26/3, 2009 at 2:3 Comment(0)
2

The best way is through comments. I'm being ironic, but as you come to understand tiny bits of the beast, add comments so you can follow your trail. The other developers will also enjoy it if you add the missing guides in the code.

Homage answered 26/3, 2009 at 2:7 Comment(1)
I always hesitate when I start doing this. When I'm learning about a codebase, my understanding of it is lacking, and any comments I have about the code will also be lacking. Adding temporary comments for yourself is not a problem, but leaving them in for others may cause more confusion. - Cullet
2

As others have suggested, dig in! Read all the available documentation you can absorb. Then see if you can find other people who are interested or knowledgeable and learn with/from them. It helps to have people to bounce ideas off of and ask questions.

For C source code, once you get a feel for what areas of the code you'd like to work on, generate ctags and cscope databases for that code. These tools make it a lot easier to jump around and understand the code. Many text editors (one example is gvim) have support for ctags and cscope so you can jump around easily.

Tactful answered 26/3, 2009 at 2:11 Comment(0)
2

Try to implement some tiny little change in the code, something that will be visible to you. That might be figuring out a workable way to output debugging statements (and figuring out where the output appears), it might be changing the default size of windows or desktop color, or something. Once you can make something happen in the codebase, you've scratched the surface of understanding and can begin to move on toward more complicated things. At that point, select a goal of something slightly more useful that you'd like the code to do, and implement that. Or check out the project's bug tracker and look for something small to start with.

Document as you go, and write unit tests as you go, and refactor as you go. When you figure out what a routine does, comment it!!

Kofu answered 26/3, 2009 at 2:11 Comment(1)
I'd prefer to use the debugger-step method mentioned elsewhere, WHEN I can. Sometimes you can't. In that case "Kilroy was here" can be a useful act of desperation. "Is this even the code that is run to do feature X?" - Cellarer
2

(warning: shameless marketing ahead)

For Java developers using Eclipse, there's nWire. It is an Eclipse plugin for navigating and visualizing large codebases.

Toodleoo answered 19/4, 2009 at 4:57 Comment(0)
1

A good way to understand a large system is to break it down into its constituent parts and focus on specific paths through the application.

Your debugger is your friend here: set a breakpoint in the thread you want to investigate, then step through it line by line, looking at what each part does. Hope that helps.

Interfile answered 26/3, 2009 at 2:9 Comment(2)
Whoops, is the standard to delete your own duplicate if someone gets to it just before you? - Dissimulation
I have no idea what you're talking about... - Interfile
