A reading list for scientific programmers [closed]
T

17

50

I am working to become a scientific programmer. I have enough background in math and statistics but am rather lacking in programming background. I have found it very hard to learn how to use a language for scientific programming because most of the references for SP are close to trivial.

My work involves statistical/financial modelling and no physics modelling. Currently, I use Python extensively with numpy and scipy. I have done some R/Mathematica. I know enough C/C++ to read code. I have no experience in Fortran.

I don't know if this is a good list of languages for a scientific programmer. If it is, what is a good reading list for learning the syntax and design patterns of these languages in scientific settings?

Trombone answered 4/11, 2009 at 4:32 Comment(3)
What languages will you be using?Pontifex
@James. Anything that does the job quickly (in prototyping) or efficiently. I am hardly constrained at all, but it must be something readable by others.Trombone
"design pattern of these languages in scientific settings": this is the problem. Even the books which pretend to cover this stuff are usually bullsh*t. Learn C++ and let experience (and numerical recipes) teach you, or stick with R or Numpy (both are great).Mancino
H
39

At some stage you're going to need floating point arithmetic. It's hard to do it well, less hard to do it competently, and easy to do it badly. This paper is a must read:

What Every Computer Scientist Should Know About Floating-Point Arithmetic
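As a small taste of what the paper covers, here is a minimal Python sketch (Python because the asker already uses it; the variable names are purely illustrative). It shows two classic surprises: decimal fractions such as 0.1 are not exactly representable in binary, and repeated addition accumulates rounding error.

```python
import math

# 0.1 and 0.2 have no exact binary representation, so their sum is not exactly 0.3.
a = 0.1 + 0.2
exact_equal = (a == 0.3)             # False
close_enough = math.isclose(a, 0.3)  # True: compare with a tolerance instead

# Adding 0.1 ten times in a plain loop accumulates rounding error.
naive_sum = 0.0
for _ in range(10):
    naive_sum += 0.1                 # ends at 0.9999999999999999, not 1.0

# math.fsum tracks the lost low-order bits and returns the correctly rounded sum.
accurate_sum = math.fsum([0.1] * 10)  # 1.0
```

The practical upshot is to avoid `==` on computed floats and to prefer tolerance-based comparisons and error-aware summation where accuracy matters.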

Histrionism answered 4/11, 2009 at 8:24 Comment(0)
K
25

I thoroughly recommend

Scientific and Engineering C++: An Introduction with Advanced Techniques and Examples by Barton and Nackman

Don't be put off by its age; it's excellent. Numerical Recipes in your favourite language (so long as it is C, C++ or Fortran) is compendious and excellent for learning from, though it does not always give the best algorithm for each problem.

I also like

Parallel Scientific Computing in C++ and MPI: A Seamless Approach to Parallel Algorithms and their Implementation by Karniadakis

The sooner you start parallel computing the better.

Katherinakatherine answered 4/11, 2009 at 8:58 Comment(3)
Do not, under any circumstances, use Numerical Recipes to try to learn a programming language.Supplicatory
Shit, too late, by about 25 years. Oh, what a wasted life. And I stand by my comment that NR is an excellent text for learning scientific programming, which is about a lot more than a programming language.Katherinakatherine
Numerical Recipes was ok 25 years ago but it is a joke today.Thionic
S
11

My first suggestion is that you look at the top 5 universities for your specific field, look at what they're teaching and what the professors are using for research. That's how you can discover the relevant language/approach.

Also have a look at this Stack Overflow question ("practices-for-programming-in-a-scientific-environment").

You're doing statistical/finance modeling? I use R in that field myself, and it is quickly becoming the standard for statistical analysis, especially in the social sciences, but in finance as well (see, for instance, http://rinfinance.com). Matlab is probably still more widely used in industry, but I have the sense that this may be changing. I would only fall back to C++ as a last resort if performance is a major factor.

Look at these related questions for help finding reading materials related to R:

In terms of book recommendations related to statistics and finance, I still think that the best general option is David Ruppert's "Statistics and Finance" (you can find most of the R code here and the author's website has Matlab code).

Lastly, if your scientific computing isn't statistical, then I actually think that Mathematica is the best tool. It seems to get very little mention amongst programmers, but it is the best tool for pure scientific research in my view. It has much better support for things like integration and partial differential equations than Matlab. There is a nice list of books on the Wolfram website.

Salic answered 4/11, 2009 at 5:27 Comment(0)
H
10

In terms of languages, I think you have a good coverage. Python is great for experimentation and prototyping, Mathematica is good for helping with the theoretical stuff, and C/C++ are there if you need to do serious number crunching.

I might also suggest you develop an appreciation of an assembly language and also a functional language (such as Haskell), not really to use, but rather because of the effect they have on your programming skills and style, and of the concepts they bring home to you. They might also come in handy one day.

I would also consider it vital to learn about parallel programming (concurrent/distributed) as this is the only way to access the sort of computing power that sometimes is necessary for scientific problems. Exposure to functional programming would be quite helpful in this regard, whether or not you actually use a functional language to solve the problem.

Unfortunately I don't have much to suggest in the way of reading, but you may find The Scientist and Engineer's Guide to Digital Signal Processing helpful.

Hudibrastic answered 4/11, 2009 at 4:59 Comment(5)
I have strong appreciation of Haskell :)Trombone
In that case, learn assembly language. IMO the best way to do that is to write a toy kernel in assembly language, because you'll learn a million things besides.Hudibrastic
Oh yeah, and there's always The Art of Computer Programming (by Knuth)Hudibrastic
You will learn a million things by learning assembler, but that's something like saying to learn biology, study physics first. Sure you'll learn a ton, but (a) not everyone needs to understand everything about how computers or software work deep down (though more general knowledge is a fine thing to have), and (b) there are other paths more immediately applicable to his field of inquiry that could also provide much insight.Hirza
@mlimber: it's a matter of opinion. Note that I used "suggest" and "IMO" about this issue. The OP should choose something that suits him.Hudibrastic
T
7

I'm a scientific programmer who entered the field in the past 2 years. I'm more into biology and physics modeling, but I bet what you're looking for is pretty similar. While I was applying for jobs and internships, there were two things that I didn't think would be that important to know, but not knowing them cost me opportunities. One was MATLAB, which has already been mentioned. The other was database design: no matter what area of SP you're in, there's probably going to be a lot of data that has to be managed somehow.

The book Database Design for Mere Mortals by Michael Hernandez was recommended to me as being a good start and helped me out a lot in my preparation. I would also make sure you at least understand some basic SQL if you don't already.
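To give an idea of what "basic SQL" means here, the following is a minimal sketch using Python's built-in `sqlite3` module. The `experiment` and `measurement` tables and their columns are made up for illustration; the point is only the shape: one table per kind of thing, a foreign key linking them, and a `JOIN` with `GROUP BY` to summarize.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with two related (hypothetical) tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE experiment (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE measurement (
            id            INTEGER PRIMARY KEY,
            experiment_id INTEGER NOT NULL REFERENCES experiment(id),
            value         REAL NOT NULL
        );
    """)
    conn.executemany(
        "INSERT INTO experiment (id, name) VALUES (?, ?)",
        [(1, "baseline"), (2, "treated")],
    )
    conn.executemany(
        "INSERT INTO measurement (experiment_id, value) VALUES (?, ?)",
        [(1, 1.0), (1, 3.0), (2, 10.0)],
    )
    conn.commit()
    return conn

def mean_by_experiment(conn):
    """Return (experiment name, mean value) pairs, sorted by name."""
    return conn.execute("""
        SELECT e.name, AVG(m.value)
        FROM experiment AS e
        JOIN measurement AS m ON m.experiment_id = e.id
        GROUP BY e.name
        ORDER BY e.name
    """).fetchall()

conn = build_demo_db()
rows = mean_by_experiment(conn)  # [('baseline', 2.0), ('treated', 10.0)]
```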

Trainband answered 17/12, 2009 at 23:48 Comment(0)
P
6

I would suggest any of the Numerical Recipes books (pick a language) as useful.

Depending on the languages you use or if you will be doing visualization there can be other suggestions.

Another book I really like is Object-Oriented Implementation of Numerical Methods, by Didier Besset. He shows how to implement many equations in Java and Smalltalk, but what is more important is that he does a fantastic job of showing how to optimize equations for use on a computer and how to deal with the errors caused by the limitations of the computer.
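To illustrate the kind of issue meant here, the following is a minimal Python sketch (not taken from Besset's book, which uses Java and Smalltalk). It compares the textbook "sum of squares" formula for the sample variance with Welford's one-pass update, which is a standard, more stable rearrangement of the same quantity. The function names and data are illustrative assumptions.

```python
def naive_variance(xs):
    """Sample variance via (sum of squares - (sum)^2 / n) / (n - 1).

    Algebraically correct, but it subtracts two huge, nearly equal numbers,
    so rounding error can swamp the result (catastrophic cancellation).
    """
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x
    return (total_sq - (total * total) / n) / (n - 1)

def welford_variance(xs):
    """Sample variance via Welford's running-mean update, which avoids
    forming large sums of squares."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1)

# The values 4, 7, 13, 16 have sample variance 30; shifting them by 1e9
# does not change the true variance, but it stresses the naive formula.
data = [1e9 + x for x in (4.0, 7.0, 13.0, 16.0)]
stable = welford_variance(data)
naive = naive_variance(data)
```

On this data the stable version returns 30, while the naive formula is typically off by far more than rounding noise.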

Pontifex answered 4/11, 2009 at 5:6 Comment(2)
+1 for Besset. NR books need to be taken with a grain of salt--code is awful, though usually functional.Lasso
I will never forgive NR (even 3rd ed, 2007) for advising people to pad signals with zeroes up to a power of two. So much work ruined... :-(Thionic
G
4

Donald Knuth's book on seminumerical algorithms.

Greggrega answered 4/11, 2009 at 10:25 Comment(0)
H
4

MATLAB is widely used in engineering for design, rapid development, and even production applications (my current project has a MATLAB-generated DLL for doing some advanced number crunching that was easier to do than in our native C++, and our FPGAs use MATLAB-generated cores for signal processing too, which is much easier than coding the same by hand in VHDL). There's also a financial toolbox for MATLAB that may be of interest to you.

This is not to say that MATLAB is the best choice for your field, but at least in engineering, it's widely used and not going anywhere soon.

Hirza answered 4/11, 2009 at 19:42 Comment(0)
R
4

One issue scientific programmers face is maintaining a repository of code (and data) that others can use to reproduce your experiments. In my experience this is a skill not required in commercial development.

Here are some readings on this:

These are in the context of computational biology but I assume it applies to most scientific programming.

Also, look at Python Scripting for Computational Science.

Relucent answered 10/12, 2009 at 14:20 Comment(0)
T
3

For generic C++ in scientific environments, Modern C++ Design by Andrei Alexandrescu is probably the standard book about the common design patterns.

Teilo answered 4/11, 2009 at 5:10 Comment(3)
MC++D is a fantastic book, but it's not for C++ beginners like the OP, nor is it any more useful for specifically scientific applications than is the GoF's original Design Patterns. If you don't know how to write your own template classes and functions and partially specialize them, for instance, you'll need a firmer grounding in the language before picking up this book.Hirza
I don't know about the specific needs of the OP, but for "design patterns in [some] scientific environments" it's a valuable foundation, IMO. Some lab teams here see it as the initial must-read; that's why I brought it up.Teilo
This book contains some esoteric C++ constructs; it is the best fit for library design with C++ templates. It is a bit dated given modern features such as perfect forwarding and variadic templates. It does not contain information about numerical methods, modeling or software architecture.Woadwaxen
A
2

Once you are up and running, I would strongly recommend reading this blog.

It describes how you use C++ templates to provide type safe units. So for example, if you multiply velocity by time you get a distance etc.
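The blog's approach relies on C++ templates, so dimension errors are caught at compile time. As a rough illustration of the idea only, here is a runtime analogue in Python; it is not the blog's code, and the `Quantity` class and its `length`/`time` exponent fields are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    """A value tagged with exponents of two base dimensions (illustrative)."""
    value: float
    length: int = 0  # exponent of length, e.g. 1 for metres
    time: int = 0    # exponent of time, e.g. -1 for "per second"

    def __mul__(self, other):
        # Multiplying quantities multiplies values and adds dimension exponents.
        return Quantity(self.value * other.value,
                        self.length + other.length,
                        self.time + other.time)

    def __add__(self, other):
        # Addition only makes sense for quantities of the same dimension.
        if (self.length, self.time) != (other.length, other.time):
            raise TypeError("cannot add quantities with different dimensions")
        return Quantity(self.value + other.value, self.length, self.time)

velocity = Quantity(3.0, length=1, time=-1)  # 3 m/s
duration = Quantity(4.0, time=1)             # 4 s
distance = velocity * duration               # length=1, time=0: a distance
```

Here `velocity + duration` raises `TypeError` when the line runs, whereas the template-based C++ version would refuse to compile, which is the main attraction of doing this in the type system.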

Angary answered 4/11, 2009 at 10:20 Comment(1)
You might also be interested in "units of measure" in Microsoft's new F# programming language.Thionic
R
2

Reading source code helps a lot, too. Python is great in this sense. I have learnt a great deal just by digging through the source code of scientific Python tools. On top of this, following your favourite tools' mailing lists and forums can enhance your skills further.

Relations answered 27/12, 2009 at 4:48 Comment(0)
G
0

This might be useful: The Nature of Mathematical Modeling

Granddaddy answered 4/11, 2009 at 5:22 Comment(0)
E
0

Donald Knuth: Seminumerical Algorithms, Volume 2 of The Art of Computer Programming

Press, Teukolsky, Vetterling, Flannery: Numerical Recipes in C++ (the book is great, just beware of the license)

Modern C++ Design

and have a gander at the source code for the GNU Scientific Library.

Edmundson answered 10/12, 2009 at 14:23 Comment(1)
The license... and the awful code and advice.Thionic
A
0

Writing Scientific Software: A Guide to Good Style is a good book with overall advice for modern scientific programming.

Ascogonium answered 20/5, 2010 at 18:13 Comment(0)
S
-1

For Java I recommend a look at Unit-API
Implementations are Eclipse UOMo (http://www.eclipse.org/uomo) or JScience.org (work in progress for Unit-API, earlier implementations of JSR-275 exist)

Servomotor answered 23/1, 2011 at 12:9 Comment(0)
