Writing hy code from hy code

| categories: hylang | tags:

Here is one of the main reasons I am interested in a lisp for programming. I want to write programs that write programs. In Python, I have ended up doing things like this where we build up a script with string formatting and manipulation, write it to a file, and run it later or somewhere else. We need this because we run a lot of our calculations through a queue system which runs asynchronously from the work we do in an editor.

import os
for x in [1, 2, 3]:
    fname = 'p{0}.py'.format(x)

    program = '''#!/usr/bin/env python
def f(x):
    return x**{0}

import sys
print f(float(sys.argv[1]))'''.format(x)

    with open(fname, 'w') as f:
        f.write(program)

    os.chmod(fname, 0o755)

Then you can call these now at the command line like:

./p2.py 3
./p3.py 3
9.0
27.0

That is not too bad because the script is simple, but it is tedious to keep the indentation right, it is not always easy to keep track of the arguments (even with numbered indexes, names, etc… in the formatting), there is limited logic you can use in the arguments (e.g. no if/elif/elif/else, etc…), you lose all the value of having an editor in Python mode, so no syntax highlighting, eldoc, code completion, automatic indentation, etc… I don't like it, but it gets the job done.

Lisps allow you to treat code like data, in an editor in lisp-mode, so it should be ideal for this kind of thing. Here we look at getting that done with hy. For the simplest forms, we simply convert the code to a string, which can then be written to a file. You can see we probably got lucky here that the objects in the expression all print in a simple form that allows us to reconstruct the code. You can see here some aspects of Python peeking through the hy implementation. In data/quoted mode, the atoms in the list are not all simple symbols. By the time the program gets to running the code, they have been transformed to objects of various types that need to be handled separately.

(setv program `(+ 4 5))
(print (+ "(" (.join " " (list-comp (str x) [x program])) ")"))
(print (list-comp (type x) [x program]))
(+ 4 5)
[<class 'hy.models.symbol.HySymbol'>, <class 'hy.models.integer.HyInteger'>, <class 'hy.models.integer.HyInteger'>]

Real programs are not this simple, and we need to handle nested expressions and other types of objects. Consider this program. It has many different types in it, and they don't all get represented by the right syntax in print (i.e. with (repr object).

(let [program `(list {"a" 1 "b" 3} "b" 3 3.0 [1 1 2] :keyword (lambda [x] (* x 3)))]
  (print (list-comp (type x) [x program]))
  (for [x program] (print (.format "{0!r}" x))))
[<class 'hy.models.symbol.HySymbol'>, <class 'hy.models.dict.HyDict'>, <class 'hy.models.string.HyString'>, <class 'hy.models.integer.HyInteger'>, <class 'hy.models.float.HyFloat'>, <class 'hy.models.list.HyList'>, <class 'hy.models.keyword.HyKeyword'>, <class 'hy.models.expression.HyExpression'>]
u'list'
{u'a' 1L u'b' 3L}
u'b'
3L
3.0
[1L 1L 2L]
u'\ufdd0:keyword'
(u'lambda' [u'x'] (u'*' u'x' 3L))

Next we make a recursive expression to handle some of these. It is recursive to handle nested expressions. Here are the things in hy.models that might need special treatment. We make sure to wrap expressions in (), lists in [], dictionaries in {}, and strings in "". Keywords have a unicode character put in front of them, so we cut that off. Everything else seems to be ok to just convert to a string. This function gets tangled to serialize.hy so it can be used in subsequent code examples.

(import hy)

(defn stringify [form &optional debug]
  "Convert a FORM to a string."
  (when debug (print (.format "{0}: {1}" form (type form))))
  (cond
   [(isinstance form hy.models.expression.HyExpression)
    (+ "(" (.join " " (list-comp (stringify x debug) [x form])) ")")]
   [(isinstance form hy.models.dict.HyDict)
    (+ "{" (.join " " (list-comp (stringify x debug) [x form])) "}")]
   [(isinstance form hy.models.list.HyList)
    (+ "[" (.join " " (list-comp (stringify x debug) [x form])) "]")]
   [(isinstance form hy.models.symbol.HySymbol)
    (.format "{}" form)]
   [(isinstance form hy.models.keyword.HyKeyword)
    ;; these have some unicode prefix I want to remove
    (.format "{}" (cut form 1))]
   [(or (isinstance form hy.models.string.HyString)
        (isinstance form unicode))
    (.format "\"{}\"" form)]
   [true
    (.format "{}" form)]))

Now, some examples. These cover most of what I can imagine coming up.

(import [serialize [stringify]])  ;; tangled from the block above

;; some examples that cover most of what I am doing.
(print (stringify `(+ 5 6.0)))
(print (stringify `(defn f [x] (* 2 x))))
(print (stringify `(get {"a" 1 "b" 3} "b")))
(print (stringify `(print (+ 4 5 (* 6 7)))))
(print (stringify `(import [numpy :as np])))
(print (stringify `(import [scipy.optimize [fsolve]])))
(print (stringify `(set [2 2 3])))
(print (stringify `(complex 4 5)))
(print (stringify `(cons 4 5)))
(+ 5 6.0)
(defn f [x] (* 2 x))
(get {"a" 1 "b" 3} "b")
(print (+ 4 5 (* 6 7)))
(import [numpy :as np])
(import [scipy.optimize [fsolve]])
(set [2 2 3])
(complex 4 5)
(cons 4 5)

Those all look promising. Maybe it looks like nothing happened. Something did happen! We took code that was quoted (and hence like a list of data), and converted it into a string representation of the code. Now that we have a string form, we can do things like write it to a file.

Next, we add a function that can write that to an executable script.

(defn scriptify [form fname]
  (with [f (open fname "w")]
        (.write f "#!/usr/bin/env hy\n")
        (.write f (stringify form)))
  (import os)
  (os.chmod fname 0o755))

Here is an example

(import [serialize [stringify scriptify]])

;; make functions
(for [x (range 1 4)]
  (scriptify
   `(do
     (import sys)
     (defn f [x]
       (** x ~x))
     (print (f (float (get sys.argv 1)))))
   ;; fname to write to
   (.format "h{}.hy" x)))

Here is the proof those programs got created.

ls h[0-9].hy
echo
cat h1.hy
h1.hy
h2.hy
h3.hy

#!/usr/bin/env hy
(do (import sys) (defn f [x] (** x 1)) (print (f (float (get sys.argv 1)))))

The code is all on one line, which doesn't matter or hy. Yep, if it didn't occur to you, we could take those strings and send them over the internet so they could get executed remotely. They are one read-str and eval away from being lisp code again. Yes there are security concerns with that. And an amazing way to get something done.

(import [serialize [*]])
(print (eval (read-str (stringify `(+ 4 5)))))
9

We can run those programs at the command line:

hy h2.hy 10
hy h3.hy 10
100.0
1000.0

Now for a more realistic test. I make some scripts related to the kinds of molecular simulation we do. These scripts just setup a model of bulk Cu or Pt, and print the generated object. In a real application we would compute some thing from this object.

(import [serialize [stringify scriptify]])

(for [element ["Cu" "Pt"]]
  (scriptify `(do (import [ase.lattice [bulk]])
                  ;; we have to str the element to avoid a unicode error
                  ;; ase does not do unicode.
                  (setv atoms (bulk (str ~element) :a 4.5 :cubic True))
                  (print atoms))
             (.format "{}.hy" element)))

Here is what one of those scripts looks like

cat Pt.hy
#!/usr/bin/env hy
(do (import [ase.lattice [bulk]]) (setv atoms (bulk (str "Pt") :a 4.5 :cubic True)) (print atoms))

Note the comments are not in the generated script. These are evidently ignored in hy, and are not even elements. We can run this at the command line to. If this script did an actual calculation, we would have a mechanism to generate simulation scripts that run calculations and output the results we want!

hy Pt.hy
Atoms(symbols='Pt4', positions=..., cell=[4.5, 4.5, 4.5], pbc=[True, True, True])

So, we can write programs that write programs!

1 Serialize as compiled Python

It could be convenient to run the generated programs from Python instead of hy. Here we consider how to do that. I adapted this code from hy.importer.write_hy_as_pyc.

(import [hy.importer :as hi])
(import [hy._compat [PY3 PY33 MAGIC wr_long long_type]])
(import marshal)
(import os)

(defn hy2pyc [code fname]
  "Write CODE as Python compiled byte-code in FNAME."

  (setv program (stringify code))

  (setv _ast (hi.import_buffer_to_ast
              program
              "main"))

  (setv code (hi.ast_compile _ast "<string>" "exec"))

  ;; create file and close it so we get the size
  (with [f (open fname "wb")] nil)
  (with [f (open fname "wb")]
        (try
         (setv st (os.fstat (f.fileno)))
         (except [e AttributeError]
           (setv st (os.stat fname))))
        (setv timestamp (long_type (. st st_mtime))))
  (with [fc (open fname "wb")]
        (if PY3
          (.write fc b"\0\0\0\0") ; I amnot sure this is right in hy with b""
          (.write fc "\0\0\0\0"))
        (wr_long fc timestamp)
        (when PY33
          (wr_long fc st.st_size))
        (.dump marshal code fc)
        (.flush fc)
        (.seek fc 0 0)
        (.write fc MAGIC)))

Now for an example.

(import [serialize [*]])

(hy2pyc `(do
          (import sys)
          (defn f [x]
            (** x 3))
          (print (.format "Hy! {0}^3 is {1}."
                          (get sys.argv 1)
                          (f (float (get sys.argv 1))))))
          "main.pyc")

Now we can execute it like this.

python main.pyc 4
Hy! 4^3 is 64.0.

Well, that worked fine too!

2 Summary

In some ways this is similar to the string manipulation approach (they both generate programs after all), but there are these differences:

  1. We do not have the indentation issues of generating Python.
  2. The code is edited in hy-mode with full language support.
  3. Instead of formatting, and string replacements, you have to think of what is quoted and what is evaluated. I find that easier to think about than with strings.

There are some ways we could simplify this perhaps. In this post I added code to the built in python types so they could be represented as lisp code. We could add something like this to each of the hy.model objects so they natively can be represented as hy code. The repr functions on these should technically be used for that I think. On the other hand, this serialize code works fine, and lets me do what I want. It is pretty cool this is all possible!

Copyright (C) 2016 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.10

Discuss on Twitter