Why does TeX/LaTeX not speed up in subsequent runs?

I really wonder why even recent TeX/LaTeX systems do not use any caching to speed up later runs. Every time I fix a single comma*, running LaTeX takes about the same amount of time, because it needs to load and convert every single picture file.

(* I know that even changing a tiny comma could affect the whole structure, but of course a well-written cache format could detect the impact of such a change. Also, there might be situations where 100% correctness is not needed as long as it's fast.)

Is there something in the TeX language that makes this complicated or impossible to accomplish, or is it just that the original implementation of TeX had no need for it (because it would have been slow anyway on the machines of the time)?

But then, on the other hand, why doesn't this annoy other people enough that they've started a fork with some sort of caching (or a transparent conversion of TeX files to a format that is faster to parse)?

Is there anything I can do to speed up subsequent LaTeX runs, apart from putting all the material into chapterXX.tex files and then commenting them out?

Busybody answered 28/4, 2010 at 0:49 Comment(0)

Let's try to understand how TeX works. What happens when you run the following?

tex.exe myfile.tex

TeX reads your file byte by byte. First of all, TeX converts each character into a pair <category code, ASCII code>: every character has a category code and an ASCII code. The category code says whether the character is, for example, an opening brace ({), an entry into math mode ($), an active character (~), or a letter (A-Z, a-z).
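A small illustration (mine, not part of the original answer) of how much a single category code assignment changes the reading of everything that follows:

% plain TeX: give $ category code 12 ("other") instead of 3 (math shift)
\catcode`\$=12
From here on, $x$ does not enter math mode; the dollar signs are printed literally.
\bye

A cache would have to remember such assignments, because they silently change the meaning of every later paragraph.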

If TeX encounters characters with category code 11 (letters) or 12 (other symbols: digits, commas, periods), it starts a paragraph. Suppose you want to cache all paragraphs.

Now suppose you change something in your document. How can TeX check that all paragraphs after your change are still the same? Maybe you changed the category code of some character. Maybe you changed the meaning of some macro. Or you removed a } somewhere and thereby changed the current font.

To be sure that a paragraph is unchanged, you must be sure that all characters in the paragraph are the same, that all their category codes are the same, that the current font is the same, that all math fonts are the same, and that the values of a long list of internal parameters are the same (for example \hsize, \vsize, \pretolerance, \tolerance, \hyphenpenalty, \exhyphenpenalty, \widowpenalty, \spaceskip, ...).
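A tiny illustration of that last point (my own example, in plain TeX): the two paragraphs below consist of identical characters with identical category codes, yet they get different line breaks because one parameter changed in between.

% identical paragraph text, typeset twice with different \hsize
\hsize=6cm
The quick brown fox jumps over the lazy dog again and again and again.

\hsize=10cm
The quick brown fox jumps over the lazy dog again and again and again.
\bye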

You can only be sure that the paragraphs before your change are the same, and even then you would have to keep the complete state after every paragraph.

Your SuperCachedTeX system is getting very complicated, isn't it?

Saberio answered 28/4, 2010 at 11:50 Comment(3)
If you restrict the problem and can assume that there are no redefinitions, catcode changes, etc., then caching, for example at the chapter level, could be done. You don't get guaranteed output, but the OP is not asking for a perfect solution. What do you think?Marga
@Patrick, I think it is an unnecessary complication. My 200-page paper compiled in 2–3 seconds.Saberio
In my experience the graphics are what really slow things down. I've got a 100-page manual that compiles in about a second if I pass [draft] to {graphicx}; otherwise it takes about a minute. As graphics and their parameters seldom change in my workflow, a simple fix would be to cache the images AND optimise the graphics handling; in this day and age graphics/image handling should really be a breeze. And I'm running 2.8 GHz, 8 GB RAM, 512 GB disk, so it's not like this is a slow machine.Genitourinary
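(For reference, the [draft] switch mentioned in the previous comment is just a package option; a minimal sketch with a made-up file name:)

\documentclass{article}
% draft option: images are not embedded; LaTeX only prints a box
% of the requested size labelled with the file name
\usepackage[draft]{graphicx}
\begin{document}
\includegraphics[width=\textwidth]{large-figure}
\end{document}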

If you're using pdftex, then you can use --draftmode on the command line for the first runs. This instructs pdftex not to generate a PDF.
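For example (assuming a pdfTeX-based pdflatex; myfile.tex stands in for your own document):

# intermediate runs: everything is processed, but no PDF is written
pdflatex --draftmode myfile.tex
pdflatex --draftmode myfile.tex
# final run actually writes the PDF
pdflatex myfile.tex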

Of course lots of things could be cached (graphics information, for instance), but the way TeX works makes that hard to do. There is a rather complex initialization when TeX starts up, and one TeX run always means exactly one PDF written out. To do caching efficiently, you would need to keep the data in memory.

You could use IPC and talk to a daemon to get the cached information, but that would involve a lot of programming. For normal purposes TeX is so blazingly fast that this does not really gain a lot. On the other hand, this is a good question, as I have seen LaTeX runs (on current hardware) that take more than 10 hours and would have benefited from caching.

Marga answered 28/4, 2010 at 7:49 Comment(7)
But --draftmode doesn't really help. In the end, this means the whole process takes even more time, because I'll have to run it once more without --draftmode. And I think that with proper caching and better internal processing there wouldn't be a need for that switch anyway, because TeX wouldn't need to run three times just to produce a file with footnotes and a bibliography.Busybody
Remember that once a page is shipped out it can't be altered (page breaking, etc., is a local optimisation). There are interesting efforts to use LuaTeX to avoid multiple TeX runs, but at present you'd have to be using ConTeXt Mark IV to see them.Krick
@Joseph: that is not true in all cases. You can use PostScript specials that do some replacement of text and add the PostScript header afterwards. This sounds obscure, but I had cases where this would have been really helpful (saving a few hours of TeX can save a lot of money). But that is not for novice users.Marga
Sure, the page can't be altered. But the caching could determine that only pages n to nn need a change, because there is a hard page break before page nn+1 and the comma wouldn't affect anything beyond it. Also, it would make some sense for TeX to have a 'non-strict mode', i.e. a mode which would just stretch the line where the comma has to fit, without touching any other lines. The output would still be nicer and easier to proof-read than the original TeX source. Of course, for the final runs, and when you care about bad boxes and the like, you should not use the non-strict mode.Busybody
@Joseph: are you talking about using Texlib to avoid re-invoking TeX/MetaPost from each other?Hemp
@Charlie: I was thinking more about the fact that ConTeXt Mark IV avoids the need for things like .aux files, so the multiple TeX runs you need with LaTeX can be bypassed.Krick
@Joseph, ConTeXt also uses auxiliary files (.tuc). Or what am I missing?Marga

Yet another answer, not strictly related:

You can use the LaTeX macro \include{...}, and with \includeonly{...} you can rerun your document for a subset of the included files only. But this is not caching, nor does it give you the complete document.
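A minimal sketch (the chapter file names are made up):

\documentclass{book}
% only chapter2.tex is re-typeset; page numbers and cross-references
% of the other chapters are taken from their .aux files
\includeonly{chapter2}
\begin{document}
\include{chapter1}
\include{chapter2}
\include{chapter3}
\end{document}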

Marga answered 28/4, 2010 at 12:18 Comment(0)

There are solutions such as preview-latex, which pre-compile stuff into a dedicated format file for speed purposes. You need to remember that TeX optimises pages on a local basis. There is no concept at the engine level of material being fixed on a particular page, so you can't just "re-TeX one page".
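In the same spirit, a format file can be used by hand as a cache for everything before \begin{document} (the "precompiled preamble" trick). A rough sketch; mypreamble.tex and mydocument.tex are made-up names, and the exact invocation differs between distributions:

# dump a format file that contains the already-processed preamble
pdflatex -ini -jobname=mypreamble "&pdflatex mypreamble.tex\dump"

# mydocument.tex contains only \begin{document} ... \end{document};
# every run now starts from the dumped state instead of re-reading the preamble
pdflatex -fmt=mypreamble mydocument.tex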

Krick answered 28/4, 2010 at 6:10 Comment(0)

Actually, the correct answer is (IMO): LaTeX already caches information in its auxiliary files (.aux, plus additional files written by other packages). So if you add a comma, this information is reused, and the typesetting run is much faster than it would be without the .aux file.
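For illustration, a made-up fragment in the format LaTeX writes to the .aux file: label and citation targets that would otherwise need a further run to resolve.

\relax
\newlabel{sec:intro}{{1}{1}}
\newlabel{eq:energy}{{2}{3}}
\bibcite{knuth1984}{1}
\@writefile{toc}{\contentsline {section}{\numberline {1}Introduction}{1}}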

Marga answered 28/4, 2010 at 10:33 Comment(3)
That’s not really what I would consider caching. As far as I understand it, the .aux file is just providing default values for some of the numbered items. Without the file, the numbers and labels would be wrong but it wouldn’t be slower.Busybody
Let's assume that a single LaTeX run takes 60 seconds and that it takes 3 runs, i.e. 3 minutes, to get your correct document. If you change a comma, getting a correct document again takes only 1/3 of that time. But there are different ways of caching, of course.Marga
This is not caching. Caching is additional state kept to make code run faster; TeX uses the auxiliary files as its primary state representation. A test: would the computation produce the same result without that state? If not, it is not caching.Hemp

TeX does have a caching facility, namely format files, and I think, pace Alexey's valuable summary of the problems of representing TeX's state, it should be possible to use them to allow a run to resume after any page eject.

The major issue is that page breaks will affect paragraphs or floats, and these may not occur at a particular point in the text but may instead occur during the execution of macros whose behaviour depends on the transient state passed to them when they were invoked.

So to make the idea of creating "breakpoints" work, one would need to hack TeX's internals to dump additional information, beyond what is normally dumped in format files, and package it up with the state of the auxiliary files. Given what Joseph says about TeX fragment previewers, why would anyone bother hacking TeX to do this?

Hemp answered 30/4, 2010 at 7:50 Comment(0)
