Safe Parsing of Format Directives in Common Lisp

I would like to read in a string from an input file (which may or may not have been modified by the user) and treat this string as a format control string, called with a fixed number of arguments. However, I understand that some format directives (~/ in particular comes to mind) could be used to inject function calls, making this approach inherently unsafe.

When using read to parse data in Common Lisp, the language provides the *read-eval* dynamic variable which can be set to nil to disable #. code injection. I'm looking for something similar that would prevent code injection and arbitrary function calls inside format directives.

Weinman answered 10/9, 2015 at 2:36 Comment(0)

If the user cannot introduce custom code but only format strings, then you can avoid the problems of print-object. Remember to use with-standard-io-syntax (or a customized version of it) to control the exact kind of output you will generate (think about *print-base*, ...).
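As a rough sketch of that advice, a small wrapper can bind the printer variables before calling format. The name format-with-standard-syntax is made up for this example. Note that with-standard-io-syntax also binds *print-readably* to T, which this sketch turns back off on the assumption that readable output is not required:

```lisp
;; Hypothetical helper: format CONTROL with ARGS under the standard
;; printer settings and return the result as a string.
(defun format-with-standard-syntax (control &rest args)
  (with-standard-io-syntax
    ;; Standard syntax binds *PRINT-READABLY* to T; disable it so that
    ;; objects without a readable representation do not signal an error.
    (let ((*print-readably* nil))
      (apply #'format nil control args))))

;; A caller's binding of *PRINT-BASE* no longer leaks into the output:
;; (let ((*print-base* 16))
;;   (format-with-standard-syntax "~S" 255))   ; => "255"
```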

You can scan the input strings to detect the presence of ~/ (but note that ~~/ is harmless: it is an escaped tilde followed by a literal slash) and refuse to interpret format strings that contain blacklisted constructs. However, some analyses are more difficult and you might need to act at runtime.
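A plain substring search for ~/ is probably not enough, since a directive may carry prefix parameters and modifiers between the tilde and the directive character (for example ~:/ or ~1,2/). Here is a minimal sketch of a scanner that skips those before looking at the directive character. The function names are invented for this example, and the parameter syntax handled here (integers, 'c, V, # and commas) is my reading of the format directive syntax, not an exhaustive parser:

```lisp
(defun directive-characters (control)
  "Return the directive characters used in CONTROL, upcased, in order."
  (let ((i 0)
        (n (length control))
        (found '()))
    (flet ((peek () (and (< i n) (char control i))))
      (loop while (< i n)
            do (cond
                 ((char/= (char control i) #\~)
                  (incf i))                 ; literal text
                 (t
                  (incf i)                  ; skip the tilde
                  ;; Skip prefix parameters: integers, 'c, V, # and commas.
                  (loop for c = (peek)
                        while c
                        do (cond ((char= c #\') (incf i 2))
                                 ((or (digit-char-p c) (find c "+-,vV#"))
                                  (incf i))
                                 (t (return))))
                  ;; Skip the : and @ modifiers.
                  (loop while (and (peek) (find (peek) ":@"))
                        do (incf i))
                  (let ((c (peek)))
                    (unless c
                      (error "Control string ends inside a directive: ~S"
                             control))
                    (push (char-upcase c) found)
                    (incf i))))))
    (nreverse found)))

(defun calls-function-p (control)
  "True if CONTROL contains a ~/ directive, with or without parameters."
  (and (member #\/ (directive-characters control)) t))

;; (calls-function-p "~a and ~~/")    ; => NIL
;; (calls-function-p "~10,'0:/foo/")  ; => T
```

The same list of directive characters could just as well be checked against a set of allowed directives instead of a single forbidden one.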

For example, if the format string is malformed, you will probably encounter an error, which must be handled (the user's directives may also be incompatible with the arguments you supply).
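One way to handle such errors, as a minimal sketch, is to wrap the call in handler-case. The name try-format is hypothetical. This only catches conditions of type error, so resource exhaustion and non-termination need separate treatment:

```lisp
;; Hypothetical helper: return the formatted string, or NIL plus the
;; condition object if FORMAT signals an error.
(defun try-format (control &rest args)
  (handler-case (apply #'format nil control args)
    (error (condition)
      (values nil condition))))

;; (try-format "~a-~a" 1 2)   ; => "1-2"
;; (try-format "~a-~a" 1)     ; => NIL and a condition, at least in SBCL,
;;                            ;    which signals an error for the missing argument
```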

Even if the user is not malicious, you can also have problems with iteration constructs:

~{<X>~:*~}

... never stops, because ~:* rewinds the current argument. To handle this, you must consider that <X> may or may not print anything. You could implement both of the following strategies:

  • have a timeout to limit the time formatting takes
  • have the underlying stream reach end-of-file when writing too much (e.g. write into a string buffer).
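Both strategies can be sketched together, assuming SBCL: the sb-gray package and sb-ext:with-timeout are SBCL extensions, not standard Common Lisp, and all other names below are made up for this example. Instead of reaching end-of-file, this sketch signals a custom error once the character budget is spent. The timeout covers the case where <X> prints nothing and the budget is therefore never reached:

```lisp
;; SBCL-specific sketch (SB-GRAY and SB-EXT:WITH-TIMEOUT are extensions).
(define-condition output-limit-exceeded (error) ())

(defclass bounded-output-stream (sb-gray:fundamental-character-output-stream)
  ((sink :initform (make-string-output-stream) :reader sink)
   (remaining :initarg :limit :accessor remaining)))

(defmethod sb-gray:stream-write-char ((stream bounded-output-stream) char)
  ;; Refuse to write once the character budget is used up.
  (when (minusp (decf (remaining stream)))
    (error 'output-limit-exceeded))
  (write-char char (sink stream)))

(defun bounded-format (limit seconds control &rest args)
  "Format into a string of at most LIMIT characters within SECONDS seconds."
  (let ((stream (make-instance 'bounded-output-stream :limit limit)))
    ;; WITH-TIMEOUT signals SB-EXT:TIMEOUT if the body runs too long.
    (sb-ext:with-timeout seconds
      (apply #'format stream control args))
    (get-output-stream-string (sink stream))))

;; (bounded-format 100 5 "~a-~a" 1 2)       ; => "1-2"
;; (bounded-format 50 5 "~{~a~:*~}" '(1))   ; signals OUTPUT-LIMIT-EXCEEDED
```

Portable code would probably use libraries such as trivial-gray-streams and bordeaux-threads for the same purpose.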

There might be other problems I currently don't see, so be careful.

Knighthead answered 10/9, 2015 at 21:43 Comment(3)
Thanks. I hadn't even thought about infinite looping problems. With problems like these, I'll probably end up writing my own micro-language for format specifiers rather than trying to tweak format. - Weinman
@SilvioMayolo This is a good approach too. But take some time to see if other formatting libraries exist: I am thinking about cl-interpol, but there might be others. Good luck! - Knighthead
@JoshuaTaylor I learned about this while golfing :-) - Knighthead

It's not just ~/ that you'd need to worry about. The pretty printer functionality has lots of possibilities for code extension, and even ~A can cause problems, because objects may have methods on print-object defined. E.g.,

(defclass x () ())

(defmethod print-object ((x x) stream)
  (format *error-output* "Executing arbitrary code...~%")
  (call-next-method x stream))

CL-USER> (format t "~A" (make-instance 'x))
Executing arbitrary code...
#<X {1004E4B513}>
NIL

I think you'd need to define for yourself which directives are safe, using whatever criteria you consider important, and then include only those.

Machiavellian answered 10/9, 2015 at 20:4 Comment(0)
