How to define whole line comment syntax in Emacs?
I want the sequence // to start a comment when it is at the beginning of a line. But inside of a line it should not start any comments.

// this is a comment
This is a URL: http://example.com

Is it possible?

Monitorial asked 11/8, 2014 at 14:16 Comment(4)
Hi, what language is it for? - Frizzly
Why is it not possible? To check for the beginning of a line there is the regexp ^//. - Frizzly
@Frizzly Emacs uses syntax tables to handle comments (especially highlighting). It is not possible to use regexps there. Is there any other way to specify a comment syntax in Emacs? - Monitorial
@Monitorial Just an update that using regexps is now possible for syntax tables. See stefan's answer below, which uses syntax-propertize-function and syntax-propertize-rules. - Heterosis

You can do this by writing a syntax-propertize-function.

I have written an example major mode below that shows this. Emacs's built-in parsing will call your syntax-propertize-function so that it can manually set the syntax-table text property on lines starting with //.

(define-derived-mode my-syntax-test-mode fundamental-mode "my syntax test"
  "A major mode where // denotes a comment but only if it is at the beginning of a line."
  :syntax-table (make-syntax-table)
  ;; our mode will use `apply-my-custom-syntax-table-appropriately' to manually set
  ;; the syntax-table text property on lines starting with //
  ;; (`setq-local' keeps the setting out of other buffers)
  (setq-local syntax-propertize-function 'apply-my-custom-syntax-table-appropriately)
  ;; change `comment-dwim' to handle this type of comments correctly
  (local-set-key [remap comment-dwim] 'my-comment-dwim))

(defvar my-custom-syntax-table
  ;; syntax table where // starts a comment and \n ends it
  (let ((table (make-syntax-table)))
    ;; one entry carries both flags: `/' is punctuation that starts a
    ;; comment only as the two-character sequence //
    (modify-syntax-entry ?/ ". 12" table)
    (modify-syntax-entry ?\n "> " table)
    table))

(defun apply-my-custom-syntax-table-appropriately (beg end)
  (save-excursion
    (save-restriction
      (widen)
      (goto-char beg)
      ;; for every line between points BEG and END
      (while (and (not (eobp)) (< (point) end))
        (beginning-of-line)
        ;; if it starts with a //
        (when (looking-at "^//")
          ;; clamp to the buffer bounds so that the first and last lines
          ;; do not signal `args-out-of-range'
          (let ((start (max (point-min) (1- (line-beginning-position))))
                (stop (min (point-max) (1+ (line-end-position)))))
            ;; remove current syntax-table property
            (remove-text-properties start stop '(syntax-table nil))
            ;; set syntax-table property to our custom one
            ;; for the whole line including the beginning and ending newlines
            (add-text-properties start stop
                                 (list 'syntax-table my-custom-syntax-table))))
        (forward-line 1)))))

(defun my-comment-dwim (arg)
  (interactive "*P")
  (require 'newcomment)
  (save-excursion
    (let ((comment-start "//") (comment-end "")
          (comment-column 0)
          ;; don't indent comments
          (comment-style 'plain))
      ;; create the region containing current line if there is no active region
      (unless (use-region-p)
        (end-of-line)
        (push-mark (line-beginning-position))
        (setq mark-active t))
      (comment-dwim nil))))
Acetylide answered 11/8, 2014 at 19:45 Comment(4)
I'm not sure about using font-lock here, because the syntax table also defines the highlighting rules. - Monitorial
Not sure about the syntax table being used for highlighting; if you take the font-lock line out of there, you will get no comment highlighting in the mode. However, maybe there is a more correct way to do it. - Acetylide
I have removed this line, and the mode still provides the highlighting (Emacs 24.3.1). - Monitorial
Hmm, it seems I may be running into something else interfering: without the line, my mode will highlight, but only after I disable and re-enable font-lock-mode. Looks like you likely don't need it, very cool. - Acetylide

I'd do it this way:

(defvar my-foo-mode-syntax-table
  (let ((st (make-syntax-table)))
    ;; Add other entries appropriate for my-foo-mode.
    (modify-syntax-entry ?/ ". 12" st)
    (modify-syntax-entry ?\n "> " st)
    st))

(defvar my-foo-font-lock-keywords
  ;; Add other rules appropriate for my-foo-mode.
  ())

(define-derived-mode my-foo-mode nil "My-Foo"
  (setq-local font-lock-defaults '(my-foo-font-lock-keywords))
  ;; Add other settings appropriate for my-foo-mode.
  (setq-local syntax-propertize-function
              (syntax-propertize-rules ("./\\(/+\\)" (1 ".")))))

Notice: No need for any special font-lock rule since font-lock automatically highlights comments for you, based on the syntax-tables.
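
One way to check this without looking at colors is to ask the parser directly. The sketch below is only an illustration: my-demo-mode and my-demo-in-comment-p are hypothetical names that are not part of the answer, and the mode simply repeats the syntax entries and the rule shown above. It relies on syntax-ppss, whose fifth element (index 4) is non-nil when point is inside a comment.

```elisp
;; Hypothetical demo mode, repeating the entries and the rule from the answer.
(defvar my-demo-mode-syntax-table
  (let ((st (make-syntax-table)))
    ;; `/' is punctuation; two in a row start a comment.
    (modify-syntax-entry ?/ ". 12" st)
    ;; A newline ends the comment.
    (modify-syntax-entry ?\n "> " st)
    st))

(define-derived-mode my-demo-mode nil "My-Demo"
  ;; Mark the slashes that follow "X/" (X being any character on the same
  ;; line) as plain punctuation, so a // in the middle of a line cannot
  ;; start a comment.
  (setq-local syntax-propertize-function
              (syntax-propertize-rules ("./\\(/+\\)" (1 ".")))))

(defun my-demo-in-comment-p (text needle)
  "Return non-nil if the end of NEEDLE, searched for in TEXT, is in a comment."
  (with-temp-buffer
    (insert text)
    (my-demo-mode)
    (goto-char (point-min))
    (search-forward needle)
    ;; `syntax-ppss' parses up to point; element 4 is the in-comment flag.
    (nth 4 (syntax-ppss))))

(my-demo-in-comment-p "// this is a comment\n" "comment")
(my-demo-in-comment-p "This is a URL: http://example.com\n" "example")
```

If the rule behaves as described, the first call should return non-nil and the second nil. Font-lock uses the same parser state to decide where to apply the comment face.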

Assuming answered 12/8, 2014 at 13:44 Comment(10)
Great idea, but it doesn't work for me. Comment highlighting just vanished. - Monitorial
My crystal ball tells me you're not using Emacs-24.4, so you must have received some error message about setq-local not existing. - Assuming
Sorry, it was just my mistake. - Monitorial
I suggest changing the regexp to [^/]\\(//\\) to allow comments starting with multiple /. - Monitorial
@Menschenkindlein: Good point, though I fixed it slightly differently. - Assuming
Where do you use my-foo-mode-syntax-table in my-foo-mode? Is it not necessary? - Ashelman
@ceving: define-derived-mode will automatically use it (by adding -syntax-table to the major mode's name). Same thing for the my-foo-mode-map keymap. - Assuming
Why did you use modify-syntax-entry to create a two-character comment? Wouldn't syntax-propertize-function have worked on its own if you had used "<" (begin comment) instead of "." (punctuation) for the syntax? - Heterosis
Update from my previous comment: I tried out what I suggested and it worked, so I created a simpler answer. See below. - Heterosis
I used a two-char comment in syntax-tables plus a syntax-propertize rule to weed out the false positives to try and optimize the syntax-table to the most common (expected) case, because syntax-tables are a bit more efficient. Depending on other factors, it may indeed be preferable to use syntax-propertize to correct false negatives instead. - Assuming

Answer: Use regexps

Short and simple

(define-derived-mode my-foo-mode prog-mode "My-Foo"
  (setq-local font-lock-keywords t)
  (setq-local syntax-propertize-function
              (syntax-propertize-rules
               ((rx line-start (* whitespace) (group "//"))  (1 "<"))
               ((rx (group "\n"))  (1 ">")))))

That is all you need, but read on if you'd like to know more.

Explanation

This is based on @stefan's excellent solution which uses syntax-propertize-function to add to a syntax-table. While simpler isn't always better, @stefan's answer does more than what the original question asked for, so I've created this answer for people who only need a small hint or who just want to modify an existing mode.

It turns out that directly manipulating a syntax table is unnecessary, since the macro syntax-propertize-rules makes it easy to map from regular expressions to syntax classes. For example, the syntax class < means "start of comment" and > means "end of comment". (See the Emacs Lisp manual.)

I set font-lock-keywords to t as that is the minimum needed to enable syntax highlighting. If you are editing an existing mode, it likely already sets that variable and will not need to be changed.

And, finally, I use Emacs' rx macro because it makes regular expressions sane in Lisp. (If you like Lisp, regular expressions, and sanity, I highly recommend using rx.)
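
To see what the two pieces of a rule amount to, here is a small sketch that can be evaluated in the scratch buffer. The constant my-demo-comment-start-re is a hypothetical name used only for this illustration; the rx form is the one from the rule above, and string-to-syntax is the conversion that syntax-propertize-rules applies to string descriptors such as "<".

```elisp
;; Hypothetical name, introduced only for this illustration.
(defconst my-demo-comment-start-re
  (rx line-start (* whitespace) (group "//"))
  "The comment-start regexp from the rule above, built with `rx'.")

;; `string-match-p' returns the index of the match, or nil if there is none.
(string-match-p my-demo-comment-start-re "// a comment")       ; matches
(string-match-p my-demo-comment-start-re "   // indented")     ; also matches
(string-match-p my-demo-comment-start-re "http://example.com") ; no match

;; `string-to-syntax' turns a descriptor into the raw value that ends up
;; in the `syntax-table' text property.
(string-to-syntax "<")   ; comment-start class
(string-to-syntax ">")   ; comment-end class
```

Note that the (* whitespace) part also accepts an indented //. If comments must begin in the very first column, as the question asks, dropping that part from the rx form should be enough.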


About syntax-propertize-rules

I was going to link to the documentation for syntax-propertize-rules, but the Emacs manual does not (as of Emacs 28.1) even mention it. Until that gets remedied, I'll paste here the built-in documentation from C-h f:

syntax-propertize-rules is a Lisp macro in ‘syntax.el’.

(syntax-propertize-rules &rest RULES)

Probably introduced at or before Emacs version 24.1.

Make a function that applies RULES for use in ‘syntax-propertize-function’. The function will scan the buffer, applying the rules where they match. The buffer is scanned a single time, like "lex" would, rather than once per rule.

Each RULE can be a symbol, in which case that symbol’s value should be, at macro-expansion time, a precompiled set of rules, as returned by ‘syntax-propertize-precompile-rules’.

Otherwise, RULE should have the form (REGEXP HIGHLIGHT1 ... HIGHLIGHTn), where REGEXP is an expression (evaluated at time of macro-expansion) that returns a regexp, and where HIGHLIGHTs have the form (NUMBER SYNTAX) which means to apply the property SYNTAX to the chars matched by the subgroup NUMBER of the regular expression, if NUMBER did match. SYNTAX is an expression that returns a value to apply as ‘syntax-table’ property. Some expressions are handled specially:

  • if SYNTAX is a string, then it is converted with ‘string-to-syntax’;
  • if SYNTAX has the form (prog1 EXP . EXPS) then the value returned by EXP will be applied to the buffer before running EXPS and if EXP is a string it is also converted with ‘string-to-syntax’. The SYNTAX expression is responsible to save the ‘match-data’ if needed for subsequent HIGHLIGHTs. Also SYNTAX is free to move point, in which case RULES may not be applied to some parts of the text or may be applied several times to other parts.

Note: back-references in REGEXPs do not work.

Heterosis answered 4/9, 2022 at 7:39 Comment(0)
