Finally happy w/ indentation in Emacs Python mode

Tim Peters (tim@ksr.com)
Sun, 09 Feb 92 16:23:45 EST

New version attached. Give it a try, gripe at will.

C-c r has gone away, and an extended version of its functionality has
been merged into C-c TAB. The mode blurb (C-h m) explanation of
indentation has been completely rewritten; read it, and also do
`C-h k C-c TAB' to learn about C-c TAB's fine points.

> [a funny guido .sig]
> "A quick survey of the literature will show that there are hundreds of
> language features to support abstract data types, but only one
> example--the stack" --Mark R. Brown and Greg Nelson

C-c TAB is now smart enough to bail you out after pasting together
Python code from a variety of sources using a variety of indentation
conventions. E.g. `C-x h C-c TAB' will change this:

>>> BEGIN CONTRIVED EXAMPLE

EmptyStack = 'attempt to access empty stack'

class Stack:

def init(s):
s.data = []
return s

def push(s,v): s.data.append(v)

def top(s):
if len(s.data) = 0:
raise \
EmptyStack
else:
return s.data[-1]

def pop(s):
top = s.top()
del s.data[-1:]
return top

>>> END CONTRIVED EXAMPLE

into this (when py-indent-offset is 4, or 4 is given as a prefix arg to
C-c TAB):

>>> BEGIN TRANSFORMED CONTRIVED EXAMPLE

EmptyStack = 'attempt to access empty stack'

class Stack:

    def init(s):
        s.data = []
        return s

    def push(s,v): s.data.append(v)

    def top(s):
        if len(s.data) = 0:
            raise \
                  EmptyStack
        else:
            return s.data[-1]

    def pop(s):
        top = s.top()
        del s.data[-1:]
        return top

>>> END TRANSFORMED CONTRIVED EXAMPLE

The meaning of a prefix argument to C-c TAB has also changed: it used
to mean "shift everything in the region by that amount"; now it's used
as a temporary value for py-indent-offset. C-c < and C-c > are the only
remaining ways to shift a region rigidly; C-c TAB now always
"normalizes" region indentation.

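For the curious, the "normalizing" that C-c TAB does can be sketched in a
few lines of Python. This is only an illustrative sketch, not the Emacs
Lisp in the attached file: the name normalize_indentation is made up,
tabs are assumed to be 8 columns wide, and continuation lines are
detected with a much cruder test (trailing backslash, no `#' anywhere on
the line) than the mode's regexps use.

```python
def normalize_indentation(lines, offset=4, base=0):
    """Re-indent consistently indented lines so each level is `offset` columns.

    Hypothetical helper for illustration only. `lines` are strings without
    trailing newlines; `base` is the column given to the outermost level.
    """
    indents = [-1]      # stack of indent columns seen so far; -1 is a sentinel
    shifted_by = 0      # how far the last base (non-continuation) line moved
    continued = False   # did the previous line end with a backslash?
    result = []
    for lineno, line in enumerate(lines, 1):
        body = line.lstrip(" \t")
        # current indentation in columns, assuming 8-column tab stops
        ci = len(line[:len(line) - len(body)].expandtabs(8))
        if not body or line.startswith("#"):
            target = 0                          # blank line or column-1 comment
        elif continued:
            target = max(0, ci + shifted_by)    # keep position relative to base
        else:
            if ci > indents[-1]:
                indents.append(ci)              # going deeper: push a new level
            elif ci in indents:
                while indents[-1] != ci:        # pop the deeper levels
                    indents.pop()
            else:
                raise ValueError("bad indentation at line %d" % lineno)
            target = base + offset * (len(indents) - 2)
            shifted_by = target - ci
        result.append(" " * target + body)
        # simplification: any '#' on the line disables continuation
        continued = body.endswith("\\") and "#" not in body
    return result


# A region pasted together from sources with different conventions.
messy = [
    "class Stack:",
    "  def top(s):",
    "\tif len(s.data) == 0:",
    "\t\traise \\",
    "\t\t      EmptyStack",
    "\telse:",
    "\t\treturn s.data[-1]",
]

for line in normalize_indentation(messy, offset=4):
    print(line)
```

The stack is what lets a dedent be matched against a level seen earlier;
a dedent to a column that was never seen is reported as an error rather
than guessed at, since the block structure can't be recovered from
inconsistent indentation.
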
it's-easier-to-use-than-to-explain-ly y'rs - tim

Tim Peters Kendall Square Research Corp
tim@ksr.com, ksr!tim@uunet.uu.net

;;; Major mode for editing Python programs.
;; by: Michael A. Guravage and Guido van Rossum <guido@cwi.nl>
;;
;; The following statements, placed in your .emacs file or site-init.el,
;; will cause this file to be autoloaded, and python-mode invoked, when
;; visiting .py files (assuming the file is in your load-path):
;;
;; (autoload 'python-mode "python-mode" "" t)
;; (setq auto-mode-alist
;;       (cons '("\\.py$" . python-mode) auto-mode-alist))

;;; Change log:
;
; Sun Feb 9 02:35:11 1992 tim
; changed py-reindent-top-level-region so that it isn't confused by
; seeing a mixture of different indentation styles in the region
; fixed potential infinite loop in py-reindent-top-level-region (if
; the region doesn't end with a newline, (forward-line 1) always
; returns a success value)
; merged py-shift-region and py-reindent-top-level-region and renamed
; the combo py-indent-region, bound to C-c TAB
; completely rewrote the mode blurb's explanation of indentation
;
; Fri Feb 7 17:18:11 1992 tim
; added support for reindenting a region with a different value for
; the indentation offset; new function py-reindent-top-level-region;
; new docs; bound to C-c r.
; replaced the hard-coded tabs in the mode blurb with "\t"
;
; Fri Feb 7 01:13:08 1992 tim
; taught py-compute-indentation about blank & comment lines:
; accept comment line indentation as-is
; skip blank & comment lines when looking for the indentation of
; the preceding statement
; change py-shift-region docs accordingly
; reversed order of change log (now most-recent first)
; changed "\C-c\C-i" to "\C-c\t" for clarity
; rearranged mode-map & syntax-table code to match other language modes
; taught py-continuation-line-p that a trailing backslash does not
; indicate continuation when it ends a comment line
; ditto py-goto-initial-line
; fixed some pathological cases; e.g.,
; if \
; 1 \
; :\
; \
; \
; # continued comment
; print 'I should be indented'
; if a == b : # \
; print 'I should be indented, but not like a continuation'
; added consts py-stringlit-re and py-continued-re to support the above;
; made py-colon-line-re even hairier
;
; Wed Feb 5 21:52:15 1992 tim
; changed new doc strings so first line makes sense on its own
; new function py-shift-region; bound to C-c TAB
; new function py-shift-region-left; bound to C-c <
; new function py-shift-region-right; bound to C-c >
; reorganized mode blurb
; sped up py-continuation-line-p
;
; Wed Feb 5 03:23:31 1992 tim
; added support for auto-indenting of continuation lines:
; new vrbl py-continuation-offset
; new function py-continuation-line-p
; new function py-goto-initial-line
; new function py-compute-indentation
; rewrote py-indent-line
; changed py-indent-line to refrain from modifying the buffer if the
; indentation is already correct
; hid the hairy colon-line regexp in a const
; changed indent-region example to use legal Python
; documented all that
;
; Mon Feb 3 20:37:27 1992 tim
; renamed file to 'python-mode.el' for consistency with other Emacs
; language modes; changed autoload instructions accordingly
; improved accuracy of new/changed docs
; changed py-python-command from defconst to defvar so .emacs can
; override it if desired
; add warning about indent-region; suggest indent-rigidly
;
; Sun Feb 2 21:48:59 1992 tim
; renamed 'python-indent' to 'py-indent-offset' for internal consistency
; replaced regexp in py-indent-line so it no longer indents after
; lines like:
; a = b # ok: # found
; a = ': #'
;
; Sun Feb 2 02:08:47 1992 tim
; added support for user-defined indentation increment:
; added python-indent variable
; changed py-indent-line to indent python-indent columns
; fixed small bug in py-indent-line (changed start of r.e.
; from [^#] to [^#\n])
; added py-delete-char function; bound to \177
; changed mode blurb accordingly

(provide 'python-mode)

;;; Constants and variables

(defvar py-python-command "python"
  "*UNIX shell command used to start Python interpreter")

(defvar py-indent-offset 4
  "*Indentation increment in Python mode")

(defvar py-continuation-offset 2
  "*Indentation (in addition to py-indent-offset) for continued lines")

(defvar py-mode-map nil "Keymap used in Python mode buffers")
(if py-mode-map
    ()
  (setq py-mode-map (make-sparse-keymap))
  (define-key py-mode-map "\C-c\C-c" 'py-execute-buffer)
  (define-key py-mode-map "\C-c|" 'py-execute-region)
  (define-key py-mode-map "\C-c!" 'py-shell)
  (define-key py-mode-map "\177" 'py-delete-char)
  (define-key py-mode-map "\C-c\t" 'py-indent-region)
  (define-key py-mode-map "\C-c<" 'py-shift-region-left)
  (define-key py-mode-map "\C-c>" 'py-shift-region-right))

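(defvar py-mode-syntax-table nil "Python mode syntax table")
(if py-mode-syntax-table
    ()
  (setq py-mode-syntax-table (make-syntax-table))
  ;; pass the table explicitly, so that loading this file doesn't change
  ;; the syntax table of whatever buffer happens to be current
  (modify-syntax-entry ?\( "()" py-mode-syntax-table)
  (modify-syntax-entry ?\) ")(" py-mode-syntax-table)
  (modify-syntax-entry ?\[ "(]" py-mode-syntax-table)
  (modify-syntax-entry ?\] ")[" py-mode-syntax-table)
  (modify-syntax-entry ?\{ "(}" py-mode-syntax-table)
  (modify-syntax-entry ?\} "){" py-mode-syntax-table)
  (modify-syntax-entry ?\_ "w" py-mode-syntax-table)
  (modify-syntax-entry ?\' "\"" py-mode-syntax-table) ; single quote is string quote
  (modify-syntax-entry ?\` "$" py-mode-syntax-table)  ; backquote is open and close paren
  (modify-syntax-entry ?\# "<" py-mode-syntax-table)  ; hash starts comment
  (modify-syntax-entry ?\n ">" py-mode-syntax-table)) ; newline ends comment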

;; a statement in Python opens a new block iff it ends with a colon;
;; while conceptually trivial, quoted strings, continuation lines, and
;; comments make this hard. E.g., consider the statement
;; if \
;; 1 \
;; :\
;; \
;; \
;; # comment
;; here we define some regexps to help

(defconst py-stringlit-re "'\\([^'\n\\]\\|\\\\.\\)*'"
  "regexp matching a Python string literal")

;; warning!: when [^#'\n\\] was written as [^#'\n\\]+ (i.e., with a
;; '+' suffix), this appeared to run 100x slower in some bad cases.
(defconst py-colon-line-re
  (concat
   "\\(" "[^#'\n\\]" "\\|" py-stringlit-re "\\|" "\\\\\n" "\\)*"
   ":"
   "\\(" "[ \t]\\|\\\\\n" "\\)*"
   "\\(#.*\\)?" "$")
  "regexp matching Python statements opening a new block")

;; this is tricky because a trailing backslash does not mean
;; continuation if it's in a comment
(defconst py-continued-re
  (concat
   "\\(" "[^#'\n\\]" "\\|" py-stringlit-re "\\)*"
   "\\\\$")
  "regexp matching Python lines that are continued")

;;; General Functions

(defun python-mode nil
"Major mode for editing Python files.

Paragraphs are separated by blank lines only.

\\[python-mode] calls the value of the variable py-mode-hook with no args,
if that value is non-nil.

INTERFACE TO PYTHON INTERPRETER

\\[py-execute-buffer]\tsends the entire buffer to the Python interpreter
\\[py-execute-region]\tsends the current region.
\\[py-shell]\tstarts a Python interpreter window; this will be used by
\tsubsequent \\[py-execute-buffer] or \\[py-execute-region] commands

VARIABLES

py-indent-offset\tindentation increment
py-continuation-offset\textra indentation given to continuation lines

py-continuation-offset is the additional indentation given to the first
continuation line in a multi-line statement. Each subsequent
continuation line in the statement inherits its indentation from the
line that precedes it, so if you don't like the default indentation
given to the first continuation line, change it to something you do like
and Python-mode will automatically use that for the remaining
continuation lines (or, until you change the indentation again).

INDENTATION

Primarily for entering new code:
\t\\[indent-for-tab-command]\tindent line appropriately
\t\\[newline-and-indent]\tinsert newline, then indent
\t\\[py-delete-char]\treduce indentation, or delete single character

Primarily for reindenting existing code:
\t\\[py-indent-region]\treindent region to match its context
\t\\[py-shift-region-left]\tshift region left by py-indent-offset
\t\\[py-shift-region-right]\tshift region right by py-indent-offset

Unlike most programming languages, Python uses indentation, and only
indentation, to specify block structure. Hence the indentation supplied
automatically by Python-mode is just an educated guess: only you know
the block structure you intend, so only you can supply correct
indentation.

The \\[indent-for-tab-command] and \\[newline-and-indent] keys try to suggest plausible indentation, based on
the indentation of preceding statements. E.g., assuming
py-indent-offset is 4, after you enter
\tif a > 0: \\[newline-and-indent]
the cursor will be moved to the position of the `x':
\tif a > 0:
\t    x
If you then enter `c = d' \\[newline-and-indent], the cursor will move
to
\tif a > 0:
\t    c = d
\t    x
Python-mode cannot know whether that's what you intended, or whether
\tif a > 0:
\t    c = d
\tx
was your intent. In general, Python-mode either reproduces the
indentation of the preceding (non-blank and non-comment) statement, or
adds an extra py-indent-offset blanks if the preceding statement has
`:' as its last significant (non-whitespace and non-comment) character.

\\[py-delete-char] is handy after \\[newline-and-indent] to reduce excess indentation. It reduces the
indentation of a line by py-indent-offset columns if point is at the
first non-blank character (if any) of a line, or at the end of an
entirely blank line; else it deletes the preceding character, converting
tabs to spaces as needed so that only one character position is deleted.

The remaining `indent' functions apply to a region of Python code. They
assume the block structure (equals indentation, in Python) of the region
is correct, and alter the indentation in various ways while preserving
the block structure:

\\[py-indent-region] reindents a region to match its context and/or with a
different value for the indentation offset. This is useful when code
blocks are moved or yanked, when enclosing control structures are
introduced or removed, or to reformat code using a new value for the
indentation offset. See the function documentation for details.

Warning: indent-region should not normally be used! It calls \\[indent-for-tab-command]
repeatedly, and as explained above, \\[indent-for-tab-command] can't guess the block
structure you intend.

\\[py-shift-region-left] shifts the region left by py-indent-offset columns.

\\[py-shift-region-right] shifts the region right by py-indent-offset columns.

The two `shift' functions above also honor numeric prefix arguments; see
the individual function documentation for details.

MODE MAP
\\{py-mode-map}"

  (interactive)
  (kill-all-local-variables)
  (use-local-map py-mode-map)
  (set-syntax-table py-mode-syntax-table)
  (setq major-mode 'python-mode mode-name "Python")

  (mapcar (function (lambda (x)
                      (make-local-variable (car x))
                      (set (car x) (cdr x))))
          '( (paragraph-separate . "^[ \t\f]*$")
             (paragraph-start . "^[ \t\f]*$")
             (require-final-newline . t)
             (comment-start . "# ")
             (comment-start-skip . "# *")
             (comment-column . 40)
             (indent-line-function . py-indent-line)))

  (run-hooks 'py-mode-hook))

;;; Functions that execute Python commands in a subprocess

(defun py-shell ()
"Start an interactive Python interpreter in another window.
The variable py-python-command names the interpreter."
  (interactive)
  (require 'shell)
  (switch-to-buffer-other-window
   (make-shell "Python" py-python-command))
  (make-local-variable 'shell-prompt-pattern)
  (setq shell-prompt-pattern "^>>> \\|^\\.\\.\\. "))

(defun py-execute-region (start end)
"Send the region between START and END to a Python interpreter.
If there is a *Python* process it is used."
  (interactive "r")
  (condition-case nil
      (process-send-string "Python" (buffer-substring start end))
    (error (shell-command-on-region start end py-python-command nil))))

(defun py-execute-buffer nil
"Send the contents of the buffer to a Python interpreter.
If there is a *Python* process buffer it is used."
  (interactive)
  (py-execute-region (point-min) (point-max)))

;;; Functions for Python style indentation

(defun py-delete-char ()
"Reduce indentation or delete character.
If at first non-blank character, or at end of blank line, reduce
indentation by py-indent-offset columns. Else delete preceding
character, converting tabs to spaces."
  (interactive)
  (backward-delete-char-untabify
   (if (and
        (= (current-indentation) (current-column))
        (>= (current-indentation) py-indent-offset))
       py-indent-offset
     1)))

(defun py-indent-line ()
  "Fix the indentation of the current line according to Python rules."
  (interactive)
  (let ( (need (py-compute-indentation)) )
    (if (= (current-indentation) need)
        nil
      (beginning-of-line)
      (delete-horizontal-space)
      (indent-to need))))

;; go to first line of current statement; usually this is the line we're
;; on, but if we're on the 2nd or following lines of a continuation
;; block, we need to go up to the first line of the block
(defun py-goto-initial-line ()
  (while (py-continuation-line-p) (forward-line -1)))

;; t iff on continuation line == preceding line ends with backslash
;; that's not in a comment
(defun py-continuation-line-p ()
  (save-excursion
    (beginning-of-line)
    (and
     ;; use a cheap test first to avoid the regexp if possible
     ;; use 'eq' because char-after may return nil
     (eq (char-after (- (point) 2)) ?\\ )
     (progn
       (forward-line -1) ; since eq test passed, there is a line above
       (looking-at py-continued-re)))))

(defun py-compute-indentation ()
  (save-excursion
    (beginning-of-line)
    (cond
     ;; are we on a continuation line?
     ( (py-continuation-line-p)
       (forward-line -1)
       (if (py-continuation-line-p) ; on at least 3rd line in block
           (current-indentation) ; so just continue the pattern
         ;; else on 2nd line in block, so indent more
         (+ (current-indentation) py-indent-offset
            py-continuation-offset)))
     ;; not on a continuation line

     ;; if at start of restriction, or on a comment line, assume they
     ;; intended whatever's there
     ( (or (bobp) (looking-at "[ \t]*#"))
       (current-indentation) )

     ;; else indentation based on that of the statement that precedes
     ;; us; use the first line of that statement to establish the base,
     ;; in case the user forced a non-std indentation for the
     ;; continuation lines (if any)
     ( t
       ;; skip back over blank & comment lines
       ;; note: will skip a blank or comment line that happens to be
       ;; a continuation line too
       (re-search-backward "^[ \t]*[^ \t#\n]" (point-min) 1)
       (py-goto-initial-line)
       (if (looking-at py-colon-line-re)
           (+ (current-indentation) py-indent-offset)
         (current-indentation))))))

(defun py-shift-region (start end count)
  (save-excursion
    (goto-char end) (beginning-of-line) (setq end (point))
    (goto-char start) (beginning-of-line) (setq start (point))
    (indent-rigidly start end count)))

(defun py-shift-region-left (start end &optional count)
"Shift region of Python code to the left.
The lines from the line containing the start of the current region up
to (but not including) the line containing the end of the region are
shifted to the left, by py-indent-offset columns.

If a prefix argument is given, the region is instead shifted by that
many columns."
  (interactive "*r\nP") ; region; raw prefix arg
  (py-shift-region start end
                   (- (prefix-numeric-value
                       (or count py-indent-offset)))))

(defun py-shift-region-right (start end &optional count)
"Shift region of Python code to the right.
The lines from the line containing the start of the current region up
to (but not including) the line containing the end of the region are
shifted to the right, by py-indent-offset columns.

If a prefix argument is given, the region is instead shifted by that
many columns."
  (interactive "*r\nP") ; region; raw prefix arg
  (py-shift-region start end (prefix-numeric-value
                              (or count py-indent-offset))))

(defun py-indent-region (start end &optional indent-offset)
"Reindent a region of Python code.
The lines from the line containing the start of the current region up
to (but not including) the line containing the end of the region are
reindented. If the first line of the region has a non-whitespace
character in the first column, the first line is left alone and the rest
of the region is reindented with respect to it. Else the entire region
is reindented with respect to the (closest non-blank & non-comment)
statement immediately preceding the region.

This is useful when code blocks are moved or yanked, when enclosing
control structures are introduced or removed, or to reformat code using
a new value for the indentation offset.

If a numeric prefix argument is given, it will be used as the value of
the indentation offset. Else the value of py-indent-offset will be
used.

Warning: The region must be consistently indented before this function
is called! This function does not compute proper indentation from
scratch (that's impossible in Python), it merely adjusts the existing
indentation to be correct in context.

Special cases: whitespace is deleted from entirely blank lines;
continuation lines are shifted by the same amount their base line was
shifted, in order to preserve their relative indentation with respect to
their base line; and comment lines beginning in column 1 are ignored."

  (interactive "*r\nP") ; region; raw prefix arg
  (save-excursion
    (goto-char end) (beginning-of-line) (setq end (point-marker))
    (goto-char start) (beginning-of-line)
    (let ( (py-indent-offset (prefix-numeric-value
                              (or indent-offset py-indent-offset)))
           (indents '(-1)) ; stack of active indent levels
           (target-column 0) ; column to which to indent
           (base-shifted-by 0) ; amount last base line was shifted
           (indent-base (if (looking-at "[ \t\n]")
                            (py-compute-indentation)
                          0))
           ci)
      (while (< (point) end)
        (setq ci (current-indentation))
        ;; figure out appropriate target column
        (cond
         ( (or (eq (following-char) ?#) ; comment in column 1
               (looking-at "[ \t]*$")) ; entirely blank
           (setq target-column 0))
         ( (py-continuation-line-p) ; shift relative to base line
           (setq target-column (+ ci base-shifted-by)))
         (t ; new base line
          (if (> ci (car indents)) ; going deeper; push it
              (setq indents (cons ci indents))
            ;; else we should have seen this indent before
            (setq indents (memq ci indents)) ; pop deeper indents
            (if (null indents)
                (error "Bad indentation in region, at line %d"
                       (save-restriction
                         (widen)
                         (1+ (count-lines 1 (point)))))))
          (setq target-column (+ indent-base
                                 (* py-indent-offset
                                    (- (length indents) 2))))
          (setq base-shifted-by (- target-column ci))))
        ;; shift as needed
        (if (/= ci target-column)
            (progn
              (delete-horizontal-space)
              (indent-to target-column)))
        (forward-line 1))))
  (set-marker end nil))

;; To do:
;; - add a newline when executing buffer ending in partial line
;; - suppress prompts when executing regions
;; - switch back to previous buffer when starting shell
;; - support for ptags

>>> END OF MSG