Cl-Protobufs Enumerations

In the last few posts we discussed family life, and before that we created a toy application using cl-protobufs and the ACE lisp libraries. Today we will dive deeper into the cl-protobufs library by looking at Enumerations. We will first discuss enumerations in Protocol Buffers, then we will discuss Lisp Protocol Buffer enums.

Enums:

Most modern languages have a concept of enums. In C++, enumerations compile down to integers and you are free to use integer equality. For example:

#include <iostream>

enum Fish {
  salmon,
  trout,
};

int main() {
  std::cout << std::boolalpha << (salmon == 0) << std::endl;
}

Will print true. This is in many ways wonderful: enums compile down to integers and there’s no cost to using them. It is baked into the language! 

Protocol Buffers are available for many languages, not just C++. You can find the documentation for Protocol Buffer enums here: 

https://developers.google.com/protocol-buffers/docs/proto#enum

Each language has its own way to support enumeration types. Languages like C++ and Java, which have built-in support for enumeration types, can treat protobuf enums like any other enum. The above enum could be written (with some caveats) in Protocol Buffer as:

enum Fish {
  salmon = 0;
  trout = 1;
}

You should be careful though: protoc will give a compile warning that the enum value 0 should be a default value, so

enum Fish {
  default = 0;
  salmon = 1;
  trout = 2;
}

Is preferred.

Let’s get into some detail for the two variants of Protocol Buffers in use.

// Example message to use below.
enum Fish {
  default = 0;
  salmon = 1;
  trout = 2;
}

message Meal {
  optional Fish fish = 1;
}

The `optional` label will only be written for proto 2.

Proto 2:

In proto 2 we can always tell whether `Meal.fish` was set. If the field has the `required` label then it must be set, by definition. (But the `required` label is considered harmful; don’t use it.) If the field has an `optional` label then we can check if it has been set or not, so again a default value isn’t necessary.

If the enum is updated to:

// Example message to use below.
enum Fish {
  default = 0;
  salmon = 1;
  trout = 2;
  tilapia = 3;
}

and someone sends fish = tilapia to a system where tilapia isn’t a valid entry, the library is allowed to do whatever it wants! In Java it sets it to the first entry, so Meal.fish would be default! 

Proto 3:

In proto3 if the value of Meal.fish is not set, calling its accessor will return the default value which is always the zero value. There is no way to check whether the field was explicitly set. A default value (i.e., a name that maps to the value zero) must always be given, else the user will get a compile error.

If the Fish enum was updated to contain tilapia as above, and someone sent a proto message containing tilapia to a system running an older program whose Fish enum doesn’t contain tilapia, the deserializer should save the enum value. That is, the underlying data structure should know it received a “3” for the fish field in Meal. How the accessors return this value is language dependent. Re-serializing the message should preserve this “unrecognized” value.

A common example is: A gateway system wants to do something with the message and then forward it to another system. Even though the middle system has an older schema for the Fish message it needs to forward all the data to the downstream system.

Cl-protobufs:

Now that we understand the basics of enumerations, it is important to understand how cl-protobufs records enumeration values.

Lisp as a language does not have a concept of enumerations; what it does understand is keywords. Taking Fish as above and running protoc, we will get (see the README: https://github.com/qitab/cl-protobufs/#enums):

(deftype fish () '(member :default :salmon :trout))

(defun fish-to-int (keyword) 
  (ecase keyword
    (:default 0)
    (:salmon 1)
    (:trout 2)))

(defun int-to-fish (int)
  (ecase int
    (0 :default)
    (1 :salmon)
    (2 :trout)))
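
For example, a round trip with the generated helpers shown above:

(fish-to-int :salmon) ; => 1
(int-to-fish 2)       ; => :TROUT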

Looking at the tilapia example, the enum deserializer preserves the unknown field in both proto2 and proto3. Calling an accessor on a field containing an unknown value will return :%undefined-n. So for tilapia we will see :%undefined-3.

Warning: To get this to work properly we have to remove type checks from protocol buffer enumerations. You can set the field value in a lisp protocol buffer message to any keyword you want, but you will get a serialization error when you try to serialize. This was a long discussion internally, but that design discussion could turn into a blog post of its own.

Conclusion:

The enumeration fields in cl-protobufs are fully proto2 and proto3 compliant. To do this we had to remove type checking. As a consumer, it is suggested that you always type check and handle undefined enumeration values in your usage of protocol buffer enums. We give you a deftype to easily check.
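
For example, a minimal check using the fish deftype from above (the helper name here is mine, not generated code):

(defun known-fish-p (value)
  "Return true if VALUE is a recognized Fish enum keyword."
  (typep value 'fish))

;; (known-fish-p :salmon)       => T
;; (known-fish-p :%undefined-3) => NIL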

I hope you have enjoyed this deep dive into cl-protobufs enums. We strive to remove as many gotchas as possible.


Thanks to Ron and Carl for the continual copy edits and improvements!

Proto Cache: Flags and Hooks

Today’s Updates

Last week we made our Pub/Sub application use protocol buffer objects for most of its internal state. This week we’ll take advantage of that change by setting startup and shutdown hooks to load state and save state respectively. We will add flags so someone starting up our application can set the load and save files on the command line. We will then package our application into an executable with a new asdf command.

Code Changes

Proto-cache.lisp

Defpackage Updates:

We will use ace.core.hook to implement our load and exit hooks. We will show how to make methods that will run at load and exit time when we use this library in the code below. In the defpackage we use the nickname hook. The library is available in the ace.core repository.

We use ace.flag as our command line flag parsing library. This is a command line flag library used extensively at Google for our lisp executables. The library can be found in the ace.flag repository.

Flag definitions:

We define four command line flags:

  • flag::*load-file*
  • flag::*save-file*
  • flag::*new-subscriber* 
    • This flag is used for testing purposes. It should be removed in the future.
  • flag::*help*

The definitions all look the same; we will look at flag::*load-file* as an example:

(flag:define flag::*load-file* ""
  "Determines the file to load PROTO-CACHE from on startup."
  :type string)
  • We use the flag:define macro to define a flag. Please see the code for complete documentation of this macro (README.md update coming). We only use a small subset of the ace.flag package.
  • flag::*load-file*: This is the global where the parsed command line flag will be stored.
  • The documentation string to document the flag. If flag:print-help is called this documentation will be printed:

    --load-file (Determines the file to load PROTO-CACHE from on startup)

     Type: STRING

  • :type : The type of the flag. Here we have a string.

We use the symbol-name string of the global in lowercase as the command line input. 

For example:

  1. flag::*load-file* becomes --load-file
  2. flag::*load_file* becomes --load_file

The :name or :names key in the flag:define macro will let users select their own names for the command line input instead of this default.
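
A hedged sketch of what that might look like; both the flag itself and the exact option syntax are hypothetical, based only on the description above:

(flag:define flag::*log-file* ""
  "Specifies the file to write PROTO-CACHE logs to."
  ;; Hypothetical: override the derived name (--log-file) with a custom one.
  :name "proto-cache-log"
  :type string)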

Main definition:

We want to create a binary for our application. Since we have no way to add publishers and subscribers outside of the repl we define a dummy main that adds publishers and subscribers for us:

(defun main ()
  (register-publisher "pika" "chu")
  (register-subscriber "pika" flag::*new-subscriber*)
  (update-publisher-any
    "pika" "chu"
    (google:make-any :type-url "a"))
  ;; Sleep to make sure running threads exit.
  (sleep 2))

After running the application we can check for a new subscriber URL in the saved proto-cache application state file. I will show this shortly.

Load/Exit hooks:

We have several pre-made hooks defined in ace.core.hook. Two useful functions are ace.core.hook:at-restart and ace.core.hook:at-exit. As one can imagine, at-restart runs when the lisp image starts up, and at-exit runs when the lisp image is about to exit.

The first thing we do when we start our application is parse our command line:

(defmethod hook::at-restart parse-command-line ()
  "Parse the command line flags."
  (flag:parse-command-line)
  (when flag::*help*
    (flag:print-help)))

You MUST call flag:parse-command-line for the defined command line flags to have non-default values.

We also print a help menu  if --help was passed in.

Then we can load our proto if the load-file flag was passed in:

(defmethod hook::at-restart load-proto-cache :after parse-command-line ()
  "Load the command line specified file at startup."
  (when (string/= flag::*load-file* "")
    (load-state-from-file :filename flag::*load-file*)))

We see an :after clause in our defmethod. We want the load-proto-cache method called during start-up but after we have parsed the command line so flag::*load-file* has been properly set. 

Note: The defmethod here uses a special defmethod syntax added in ace.core.hook. Please see the hook-method documentation for complete details.

Finally we save our image state at exit:

(defmethod hook::at-exit save-proto-cache ()
  "Save the command line specified file at exit."
  (when (string/= flag::*save-file* "")
    (save-state-to-file :filename flag::*save-file*)))

The attentive reader will notice our main function never explicitly called any of these hook functions…

Proto-cache.asd:

We add code to build an executable using asdf:

(defsystem :proto-cache …
  :build-operation "program-op"
  :build-pathname "proto-cache"
  :entry-point "proto-cache:main")

This is a program-op. The executable pathname is relative: we save the binary as “proto-cache” in the same directory as our proto-cache code. The entry point function is proto-cache:main.

We may then call: 

sbcl --eval "(asdf:operate :build-op :proto-cache)" 

at the command line to create our binary.

Running our binary:

With our binary built we can call:

./proto-cache --save-file /tmp/first.pb --new-subscriber http://www.google.com

Trying cat /tmp/first.pb:

pika'
http://www.google.com
a?pika"chujg

These are serialized bytes, so one shouldn’t expect the output to be readable. We can see that “http://www.google.com”, “pika”, and “chu” are all saved.

Calling

./proto-cache   --load-file /tmp/first.pb --save-file /tmp/first.pb --new-subscriber http://www.altavista.com

And then cat /tmp/first.pb:

I
pikaA
?http://www.altavista.com
http://www.google.com
a?pika"chujg
“

Finally calling  ./proto-cache  --help

We get:

Flags from ace.flag:

    --lisp-global-flags
     (When provided, allows specifying global and special variables as a flag on the command line.
       The values are NIL - for none, :external - for package external, and T - for all flags.)
     Type: ACE.FLAG::GLOBAL-FLAGS

    --help (Whether to print help) Type: BOOLEAN Value: T

    --load-file (Determines the file to load PROTO-CACHE from on startup)
     Type: STRING
     Value: ""

    --new-subscriber (URL for a new subscriber, just for testing)
     Type: STRING
     Value: ""

    --lisp-normalize-flags
     (When non-nil the parsed flags will be transformed into a normalized form.
       The normalized form contains hyphens in place of underscores, trims '*' characters,
       and puts the name into lower case for flags names longer than one character.)
     Type: BOOLEAN

    --save-file (Determines the file to save PROTO-CACHE from on shutdown)
     Type: STRING
     Value: ""

This shows our provided documentation of the command line flags as expected.

Conclusions:

Today we added command line flags, load and exit hooks, and made our application buildable as an executable. We can build our executable and distribute it as we see fit. We can direct it to load and save the application state to user specified files without updating the code. There is still much to do before it’s done but this is slowly becoming a usable application.

There are a few additions I would like to make, but I have a second child coming soon. This may (or may not) be my last technical blog post for quite some time. I hope this sequence of Proto Cache posts has been useful thus far, and I hope to have more in the future.

Thanks to Ron Gut and Carl Gay for copious edits and comments.

Proto Cache: Saving State

Today’s Updates:

In our last post we implemented a basic Pub/Sub application that stores an Any protocol buffer message and a list of subscribers. When the Any protocol buffer message gets updated we send the new Any message in the body of an HTTP request to all of the subscribers in the subscriber list.

Today we will update our service to save all of the state in a protocol buffer message. We will also add functionality to save and load the state of the Proto Cache application. 

Note: Viewing the previous post is highly suggested!

Code Updates:

Note: In the diff snippets below, the removed code is shown first, followed by the code that replaces it.

pub-sub-details.proto

`syntax = "proto3";`

We will use proto3 syntax. I’ve yet to find a great reason to choose proto3 over proto2, but I’ve also yet to find a great reason to choose proto2 over proto3. The biggest reason to choose proto3 over proto2 is that most people use proto3, but the Any proto will store proto2 or proto3 messages regardless.

import "any.proto";

Our users are publishing Any messages to their clients, so we must store them in our application state. This requires us to include the any.proto file in our proto file.

message PubSubDetails

This contains (almost) all of the state needed for the publish subscribe service for one user:

  • repeated string subscriber_list
  • google.protobuf.Any current_message
    • This is the latest Any message that the publisher has stored in the Proto Cache.
  • string username
  • string password
    • For any kind of production use this should be salted and hashed. 

message PubSubDetailsCache

This message contains one entry, a map from a string (which will be a username for a publisher) to a PubSubDetails instance. The attentive reader will notice that we save the username twice, once in the PubSubDetails message and once in the PubSubDetailsCache map as the key. This will be explained when we discuss changes to the proto-cache.lisp file.

proto-cache.asd

The only difference in proto-cache.asd from all of the other asd files we’ve seen using protocol buffers is the use of a protocol buffer message in a package different from our current package. That is, any.proto resides in the cl-protobufs package but we are including it in the pub-sub-details.proto file in proto-cache.

To allow the protoc compiler to find the any.proto file we give it a :proto-search-path containing the path to the any.proto file. 


...
    :components
    ((:protobuf-source-file "pub-sub-details"
      :proto-pathname "pub-sub-details.proto"
      :proto-search-path ("../cl-protobufs/google/protobuf/"))
...

Note: We use a relative path: “../cl-protobufs/google/protobuf/”, which may not work for you. Please adjust to reflect your set-up.

We don’t need a component in our defsystem to load the any.proto file into our lisp image since it’s already loaded by cl-protobufs. We might want to add one anyway, just to acknowledge the direct dependency on the any.proto file.

proto-cache.lisp

Defpackage updates:

We are adding new user invokable functionality so we export:

  • save-state-to-file
  • load-state-from-file

local-nicknames:

  • cl-protobufs.pub-sub-details as psd
    • This is merely to save typing. The cl-protobufs.pub-sub-details is the package that contains the functionality derived from pub-sub-details.proto.

Globals:

*cache*: This will be a protocol buffer message containing a hash table with string keys and pub-sub-details messages. 

(defvar *cache* (make-hash-table :test 'equal))
(defvar *cache* (psd:make-pub-sub-details-cache))

*mutex-for-pub-sub-details*: Protocol buffer messages can’t store lisp mutexes. Instead, we store the mutex for a pub-sub-details in a new hash-table with string (username) keys.
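
A minimal sketch of that global, following the description above (the docstring is mine):

(defvar *mutex-for-pub-sub-details* (make-hash-table :test 'equal)
  "Maps a publisher's username string to the fr-mutex protecting its entry.")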

make-pub-sub-details:

This function makes a psd:pub-sub-details protocol buffer message. It’s almost the same as the previous iteration of pub-sub-details except for the addition of username.


...
  (make-instance 'pub-sub-details :password password))
  (psd:make-pub-sub-details :username username
                            :password password
                            :current-any (google:make-any))
...

(defmethod (setf psd:current-any) (new-value (psd psd:pub-sub-details))

This is really a family of functions:

  • :around: When someone tries to set the current-message value on a pub-sub-details struct we want to write-protect the pub-sub-details entry. We use an around method which activates before any call to the psd:current-any setter. Here we take the username from the pub-sub-details message and write-hold the corresponding mutex in the *mutex-for-pub-sub-details* global hash-table. Then we call call-next-method which will call the main (setf current-any) method.
(defmethod (setf current-any) (new-value (psd pub-sub-details))
(defmethod (setf psd:current-any) :around (new-value (psd psd:pub-sub-details))
  • (setf psd:current-any): This is the actual defmethod defined in cl-protobufs.pub-sub-details. It sets the current-message slot on the message struct.
  • :after: This occurs after the current-any setter was called. We send an http call to all of the subscribers on the pub-sub-details subscriber list. Minus the addition of the psd package prefix to accessor functions of pub-sub-details this function wasn’t changed.

register-publisher:

The main differences between the last iteration of proto-cache and this one are listed below; a hedged sketch of the updated function follows the list:

  1. This *-gethash method is exported by cl-protobufs.pub-sub-details so the user can call gethash on the hash-table in a map field of a protocol buffer message.
    • (gethash username *cache*)
    • (psd:pub-sub-cache-gethash username *cache*)
  2. We add a mutex to the *mutex-for-pub-sub-details* hash-table with the key being the username string sent to register-publisher.
  3. We return t if the new user was registered successfully, nil otherwise.
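
Putting those pieces together, a minimal sketch of register-publisher might look like the following. The lambda list of make-pub-sub-details and the assumption that the generated -gethash accessor is setf-able are mine, not taken from the repository.

(defun register-publisher (username password)
  "Register a new publisher. Returns T on success, NIL if USERNAME is taken."
  (act:with-frmutex-write (*cache-mutex*)
    (unless (psd:pub-sub-cache-gethash username *cache*)
      ;; Assumption: the generated -gethash accessor supports setf.
      (setf (psd:pub-sub-cache-gethash username *cache*)
            (make-pub-sub-details username password))
      ;; Item 2: a per-publisher mutex keyed by the username string.
      (setf (gethash username *mutex-for-pub-sub-details*)
            (act:make-frmutex))
      ;; Item 3: T on success; the UNLESS returns NIL if the user already exists.
      t)))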

register-subscriber and update-publisher-any:

  1. The main difference here is:
    1. (gethash publisher *cache*)
    2. (psd:pub-sub-cache-gethash publisher *cache*)
  2. We have to add the psd package prefix to all of the pub-sub-details accessors.

save-state-to-file:

(defun save-state-to-file (&key (filename "/tmp/proto-cache.txt"))
  "Save the current state of the proto cache stored in the *cache* global
   to FILENAME as a serialized protocol buffer message."
  (act:with-frmutex-read (*cache-mutex*)
    (with-open-file (stream filename :direction :output
                                     :element-type '(unsigned-byte 8))
      (cl-protobufs:serialize-to-stream stream *cache*))))

This is a function that accepts a filename as a string, opens the file for output, and calls cl-protobufs:serialize-to-stream. This is all we need to do to save the state of our application!

load-state-from-file:

We need to do three things:

  1. Open a file for reading and deserialize the Proto Cache state saved by save-state-to-file.
  2. Create a new map containing the mutexes for each username.
  3. Set the new state into the *cache* global and the new mutex hash-table in *mutex-for-pub-sub-details*.
    1. We do write-hold the *cache-mutex* but I would suggest only loading the saved state when Proto Cache is started.
(defun load-state-from-file (&key (filename "/tmp/proto-cache.txt"))
  "Load the saved *cache* global from FILENAME. Also creates
   all of the fr-mutexes that should be in *mutex-for-pub-sub-details*."
  (let ((new-cache
          (with-open-file (stream filename :element-type '(unsigned-byte 8))
            (cl-protobufs:deserialize-from-stream
              'psd:pub-sub-details-cache :stream stream)))
        (new-mutex-for-pub-sub-details (make-hash-table :test 'equal)))
    (loop for key being the hash-keys of (psd:pub-sub-cache new-cache)
          do
             (setf (gethash key new-mutex-for-pub-sub-details)
                   (act:make-frmutex)))
    (act:with-frmutex-write (*cache-mutex*)
      (setf *mutex-for-pub-sub-details* new-mutex-for-pub-sub-details
            *cache* new-cache))))

Conclusion:

The main update we made today was defining pub-sub-details in a .proto file instead of a Common Lisp defclass form. The biggest downside is the requirement to save the pub-sub-details mutex in a separate hash-table. For this cost, we:

  1. Gained the ability to save our application state with one call to cl-protobufs:serialize-to-stream.
  2. Gained the ability to load our application with little more than one call to cl-protobufs:deserialize-from-stream.

We were also able to utilize the setf methods defined in cl-protobufs to create :around and :after methods.

Note: Nearly all services will be amenable to storing their state in protocol buffer messages.

I hope the reader has gained some insight into how they can use cl-protobufs in their application even if their application doesn’t make http-requests. Being able to save the state of a running program and load it for later use is very important in most applications, and protocol buffers make this task simple.

Thank you for reading!

Thanks to Ron, Carl, and Ben for edits!

Proto Cache: Implementing Basic Pub Sub

Today’s Updates

In our last post we saw some of the features of the ace.core.defun and ace.core.thread libraries by creating a thread-safe cache of the Any protocol buffer object. Today we are going to update the proto-cache repository to implement publisher/subscriber features. This will allow a publisher to publish a feed of Any messages and a subscriber to subscribe to such a feed.

It is expected (but not required) that the reader has read the previous post Proto Cache: A Caching Story. That post details some of the functions and objects you will see in today’s code.

Note: This is a basic implementation, not one ready for production use. This will serve as our working project going forward.

Code Updates

Proto-cache.asd

We want subscribers to be able to get new versions of an Any protocol buffer message. On the web, the usual way to receive messages is over HTTP. We use the Drakma HTTP client. You can see we added :drakma to the depends-on list in the defsystem.

Proto-cache.lisp

There are three major regions to this code. The first region is the global objects that make up the cache. The second is the definition of a new class, pub-sub-details. Finally the actual publisher-subscriber functions are at the bottom of the page.

Global objects:

The global objects section looks much like it did in our previous post. We update the *cache* hash-table to use equal as its test function and we are going to make the keys to this cache be username strings.

Pub-sub-details class:

The pub-sub-details class contains the data we need to keep track of the publisher and subscriber features:

  • subscriber-list: This will be a list of the HTTP endpoints to send the Any messages to after the Any message is updated. Currently, we only allow for an HTTP message string. Future implementations should allow for security functionality on those endpoints.
  • current-any: The current Any message that the publisher has supplied.
  • mutex: A fr-mutex to protect the current-any slot. This should be read-held to get the current-any and write-held to set a new current-any message.
  • password: The password for the publisher, held as a string.

We shouldn’t be saving the password as a string in the pub-sub-details class. At a minimum we should salt and hash this value. In the future we should implement an account system for readers and subscribers, giving access to reading and updating the pub-sub-details. As this is only instructional and not production-ready code, I feel okay leaving it as is for the moment.

We create a make-pub-sub-details function that will create a pub-sub-details object with a given password. The register function doesn’t allow the user to set an Any message at creation time, and none of the other slots are useful to the publisher.

We create an accessor method to set the any-message value slot. We also create an :after method to send the Any message to any listening subscribers by iterating through the subscriber list and calling drakma:http-request. We wrap this in unwind-protect so an IO failure doesn’t stop other subscribers from getting the message.
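
The :after method described above might look something like this sketch. The accessor names come from the slot descriptions; how the message body is encoded is an assumption, and where the post mentions unwind-protect, this sketch uses ignore-errors so one failed request doesn't stop the remaining subscribers from being notified.

(defmethod (setf current-any) :after (new-value (psd pub-sub-details))
  "Send the newly stored Any message to every subscriber."
  (dolist (address (subscriber-list psd))
    ;; ignore-errors: keep notifying the rest even if one request fails.
    (ignore-errors
      (drakma:http-request address
                           :content-type "application/octet-stream"
                           :content (cl-protobufs:serialize-to-bytes new-value)))))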

Finally we add a setter function for the subscriber list.

Function definitions:

Register-publisher:

This function is the registration point for a new publisher. It is almost the same as set-in-cache from our previous post except it checks that an entry in the cache for the soon-to-be-registered publisher doesn’t already exist. It would be bad to let a new publisher overwrite an existing publisher.

Register-subscriber:

Here we use a new macro, ace.core.etc:clet from the etc package in ace.core.

(defun register-subscriber (publisher address)
  "Register a new subscriber to a publisher."
  (ace:clet ((ps-struct
               (act:with-frmutex-read (*cache-mutex*)
                 (gethash publisher *cache*)))
             (ps-mutex (mutex ps-struct)))
    (act:with-frmutex-write (ps-mutex)
      (push address (subscriber-list ps-struct)))))

In the code above we search the cache for the publisher’s entry; if the entry is found then ps-struct will be non-nil and we can evaluate the body, adding the subscriber to the list. If the publisher is not found we return nil.

Update-publisher-any:

(defun update-publisher-any (username password any)
  "Updates the google:any message for a publisher
   with a specified username and password.
   The actual subscriber calls happen in a separate thread
   but 'T is returned to the user to indicate the any
   was truly updated."
  (ace:clet ((ps-class
              (act:with-frmutex-read (*cache-mutex*)
                (gethash username *cache*)))
             (correct-password (string= (password ps-class)
                                        password)))
    (declare (ignore correct-password))
    (act:make-thread
     (lambda (ps-class)
       (setf (current-any ps-class) any))
     :arguments (list ps-class))
    t))

In the update-publisher-any code we use clet to verify that the publisher exists and that the password matches. We ignore the correct-password binding though.

We don’t want the publisher to be thread-blocked while we send the new message to all of the subscribers, so we update the current-any in a separate thread. To do this we use the ace.core.thread function make-thread. A keen reader will see that on SBCL this calls SBCL’s make-thread function; otherwise it calls the bordeaux-threads make-thread function.

If we are able to find a publisher with the correct password we return T to show success.

Conclusion

In today’s post we have made a basic publisher-subscriber library that will send an Any protocol buffer message to a list of subscribers. We have detailed some new functions that we used in ace.core. We have also listed some of the problems with this library. The code has evolved substantially from the previous post but it still has a long way to go before being production-ready.

Thank you for reading!


Ron Gut, Carl Gay, and Ben Kuehnert gave comments and edits to this post.

Proto Cache: A Caching Story

What is Proto-Cache?

I’ve been working internally at Google to open source several libraries including cl-protobufs and a series of utility libraries we call “ace”. I wrote several blog posts making an HTTP server that takes in either protocol buffers or JSON strings and responds in kind. I think I have worked enough on Mortgage Server and wish to work on a different project.

Proto-cache will grow up to be a pub-sub system that takes in google.protobuf:any protos and sends them to users over HTTP requests. I’m developing it to showcase the ace.core library and the Any proto well-known type. In this post we create a cache system which stores google.protobuf.any messages in a hash-table keyed off of a symbol.

The current incarnation of Proto Cache:

The code can be found here: https://github.com/Slids/proto-cache

Proto-cache.asd:

This is remarkable inasmuch as cl-protobufs isn’t required for the defsystem! It’s not required at all, but we do require the cl-protobufs.google.protobuf:any protocol buffer message object. Right now we only add it to and get it from the cache. This allows us to store a protocol buffer message object that any user system can parse by calling unpack-any. We never have to understand the message inside.

Proto-cache.lisp:

The actual implementation. We give three different functions:

  • get-from-cache
  • set-in-cache
  • remove-from-cache

We also have a:

  • fast-read mutex
  • hash-table

Note: The ace.core library can be found at: https://github.com/cybersurf/ace.core

Fast-read mutex (fr-mutex):

The first interesting thing to note is the fast-read mutex. This can be found in the ace.core.thread package included in the ace.core utility library. This allows for mutex free reads of a protected region of code. One has to call:

  • (with-frmutex-read (fr-mutex) body)
  • (with-frmutex-write (fr-mutex) body)

If the body of with-frmutex-read finishes with nobody having called with-frmutex-write then the value is returned. If someone calls with-frmutex-write while another thread is in with-frmutex-read then the body of with-frmutex-read has to be re-run. One should be careful not to modify state in the with-frmutex-read body.

Discussion About the Individual Functions

get-from-cache:

(acd:defun* get-from-cache (key)
  "Get the any message from cache with KEY."
  (declare (acd:self (symbol) google:any))
  (act:with-frmutex-read (cache-mutex)
    (gethash key cache)))


This function uses the defun* form from ace.core.defun. It looks the same as a standard defun except that it has a new declare statement. The declare statement takes the form:

(declare (acd:self (lambda-list-type-declarations) output-declaration))

In this function we state that the input KEY must be a symbol and the return value is going to be a google:any protobuf message. The output declaration is optional. For all of the options please see the macro definition for ace.core.defun:defun*.

The with-fr-mutex-read macro is also being used.

Note in the macro’s body we only do a simple accessor call into a hash-table. Safety is not guaranteed, only consistency.

set-in-cache:

(acd:defun* set-in-cache (key any)
  "Set the ANY message in cache with KEY."
  (declare (acd:self (symbol google:any) google:any))
  (act:with-frmutex-write (cache-mutex)
    (setf (gethash key cache) any)))

We see that the new defun* call is used. In this case we have two inputs: KEY will be a symbol and ANY will be a google:any proto message. We also see that we will return a google:any proto message.

The with-frmutex-write macro is being used. The only thing that is done in the body is setting a cache value. If one thread sets a message into the cache while another is getting a message from it, it is possible the reader will have to read multiple times. In systems where readers are more common than writers, fr-mutexes and spinlocking are much faster than having readers lock a mutex for every read.

remove-from-cache:

We omit this function in this write-up for brevity.
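
For completeness, here is a sketch of what it might look like, following the same pattern as set-in-cache above:

(acd:defun* remove-from-cache (key)
  "Remove the any message stored in the cache under KEY."
  (declare (acd:self (symbol)))
  (act:with-frmutex-write (cache-mutex)
    (remhash key cache)))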

Conclusion:

Fast-read mutexes like the one found in ace.core.thread are incredibly useful tools. Having to access a mutex can be slow even in cases where that mutex is never locked. I believe this is one of the more useful additions in the ace.core library.

The new defun* macro found in ace.core.defun for creating function definitions is a more mixed bag. I find a lack of clarity in mapping the lambda list s-expression in the defun statement to the s-expression in the declaration. Others may find that it provides nicer syntax and that the clarity is obvious.

Future posts will show the use of the any protocol buffer message.

As usual Carl Gay gave copious edits and suggestions.

Mortgage Server on a Raspberry Pi

In the last post we discussed creating a server to calculate an amortization schedule that takes and returns both protocol buffer messages and JSON. In this post we will discuss hosting this server on a Raspberry Pi. There are some pitfalls, and the story isn’t complete, but it’s still fairly compelling.

What We Will Use:

Hardware:

We will use a Raspberry Pi 3 Model B as our server, running the stock operating system, Raspbian. This SoC has a quad-core 64-bit processor with on-chip floating point. The operating system itself is 32-bit, which makes the processor run in 32-bit mode.

Software:

We will be using SBCL as our Common Lisp, CL-PROTOBUFS as our protocol buffer and JSON library, and Hunchentoot as our web server.

Problems

1. SBCL on Raspbian

When trying to run the mortgage-info server on Raspbian the first error I got was an inability to load the lisp file generated by protoc. On contacting Doug Katzman he noted I was running an old version of SBCL. The Raspbian apt-get repository has an old version of SBCL. If someone desires to run SBCL on a Raspberry Pi they should follow the binary installation instructions here: http://www.sbcl.org/getting.html.

2. CL-Protobufs on a 32-Bit OS

The cl-protobufs library has been optimized to run on a 64-bit x86 platform. The Raspberry Pi environment is 32-bit ARM which, as noted before, is supported by SBCL. I don’t think anyone had attempted to run cl-protobufs on 32-bit ARM SBCL before. After modifying cl-protobufs.asd to load float-bits.lisp on SBCL when not running on a 64-bit platform, we could quickload mortgage-info into a REPL.

3. Bugs in the mortgage-info repo  

There were several bugs I fixed in my very limited testing of the mortgage-info repo, as well as some bugs that still exist.

  1. When trying to set numbers in the proto message structs I had to coerce them to double-float. I’m not sure why… This works on SBCL running on the x86-64 without the coercions.
  2. A division by 0 bug if the entered interest rate is 0.
  3. The possibility of having 0 as the number of repayment periods. I added an assertion so we will return a 500 stating the assertion was hit. We should have a more graceful error message than a stack trace, but this is currently only a proof of concept.
  4. The mortgage.proto file had interest as an integer, but interest is usually a float divisible by .125. 
  5. We have rounding problems if the interest rate is too high (say 99%). We only ever pay interest and the amount never goes down, at least with a 300-payment period. This is most likely due to rounding; we do not accept fractional pennies. This is okay; if the national interest rate went anywhere near 99% we would have BIG problems.

CL-protobufs on the Pi

I have cl-protobufs running on SBCL on the Raspberry Pi, but some of the tests don’t pass. I’m not sure if it would work on a 64-bit OS on the Raspberry Pi; I don’t have the inclination to get a 64-bit OS for my Pi. If you do, please tell me what happens!

I wasn’t able to get CCL on ARM32 to load cl-protobufs. It gives an error saying it doesn’t have asdf 3.1. Quickloading asdf I get an undefined function version<=. If any CCL folks have an idea about what’s going on, please send me a message.

Trying to run ABCL led me to yet another bug: https://github.com/armedbear/abcl/issues/359

Running Server

My Raspberry Pi is running at: http://65.96.161.53:4242/mortgage-info

Feel free to send either JSON or protobuf messages to the server.

Example JSON:

{
  "interest": 3,
  "loan_amount": 380000,
  "num_periods": 300
}

I don’t know how long I will keep it running. If it goes down and you are interested in sending it messages please send me an email.


Ron, Carl, and Ben edited this post (as usual). Doug provided a great deal of help with SBCL on ARM 32.

Sending Protocol Buffers as an Octet Vector

In our previous posts on using Hunchentoot to send protocol buffer messages we turned them into base64-encoded strings and sent them as parameters in an HTTP post call. This allows us to send multiple protocol buffer messages in a single post call using multiple post parameters. In this post we will show how we can send a single protocol buffer message in the body of a post call as binary data instead of base64 encoding.

Note: I am new to using Hunchentoot, and would have started by sending an octet vector in the body of a post call if I had known how. On reviewing the last blog post, Carl Gay asked why this method wasn’t used, and the answer was lack of knowledge. After learning that one could use `hunchentoot:raw-post-data` to access the post body I was able to write this simpler method.

Hello-world-client

The changes from our previous post where we turned our octet-vectors into base64 encoded strings to this post where we just send the octet vector can be found here.

ASD file

Since we are sending an octet-vector we no longer need to worry about flexi-streams, cl-base64, and protobuf-utilities. We removed them from the asd file. 

Implementation

This change is a dramatic simplification of our post call. All we have to do is use drakma to call our web server, setting the :content-type to application/octet-stream and :content to the serialized proto message. Since we assume the web server will be sending us application/octet-stream data back, we can deserialize the reply and be on our way.

(response
  (cl-protobufs:deserialize-from-bytes
   'hwp:response
   (drakma:http-request
    address
    :content-type "application/octet-stream"
    :content (cl-protobufs:serialize-to-bytes proto-to-send))))
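
A minimal self-contained sketch of that call wrapped in a function; the function name and argument names are illustrative and follow the earlier hello-world posts:

(defun call-hello-world (proto-to-send address)
  "Send PROTO-TO-SEND to ADDRESS as binary data and return the
   deserialized hwp:response message."
  (cl-protobufs:deserialize-from-bytes
   'hwp:response
   (drakma:http-request
    address
    :content-type "application/octet-stream"
    :content (cl-protobufs:serialize-to-bytes proto-to-send))))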

Hello-world-server

The changes from our previous post where we turned our base64 encoded strings into octet-vectors to this post where we just read the octet vector can be found here.

ASD file

Since we are sending an octet-vector we no longer need to worry about protobuf-utilities. We removed this from the asd file. 

Implementation

This change is a dramatic simplification of our post handler. First we set hunchentoot:content-type* to application/octet-stream so it knows we will return an octet-vector. Then we call raw-post-data and deserialize the result. We do our application logic and create our response. Finally we serialize our reply proto and return the octet-vector.

The one gotcha in all of this is the inability to either send or receive the empty octet-vector. Either drakma just sends nil, or hunchentoot receives the octet stream as nil. Care should be taken to make sure one doesn’t try to deserialize nil, as that’s a type error. We all know nil is not of type octet-vector!

(define-easy-handler (hello-world :uri "/hello") ()
  (setf (hunchentoot:content-type*) "application/octet-stream")
  (let* ((post-request (raw-post-data))
         (request (if post-request
                      (cl-protobufs:deserialize-from-bytes
                       'hwp:request post-request)
                      (hwp:make-request)))
         (response (hwp:make-response
                    :response
                    (if (hwp:request.has-name request)
                        (format nil "Hello ~a" (hwp:request.name request))
                        "Hello"))))
    (cl-protobufs:serialize-to-bytes response)))

Final Thoughts

Sending and receiving protocol buffers as octet-vectors is a simpler way of using cl-protobufs with hunchentoot than trying to use HTTP parameters. Anyone using protocol buffers will probably send and receive only one message at a time (or wrap multiple messages in one message), so this should be considered the canonical use case. This is how gRPC works.

I hope you enjoyed this series on cl-protobufs, and hope you enjoy adding it into your own toolbox of useful Lisp packages.


I would like to thank Carl Gay for taking the time to edit the post and provide information on Hunchentoot Web Server.

Serializing and Deserializing Protobuf Messages for HTTP

So far, I’ve made two posts: one creating an HTTP client which sends and receives protocol buffer messages and one creating an HTTP server that accepts and responds with protocol buffer messages. In both of these posts we had to do a lot of extra toil serializing protocol buffers into base64-encoded strings and deserializing protocol buffers from base64-encoded strings. In this post we create three macros and a function to help us serialize and deserialize protocol buffers in our HTTP server and client.

Notes:

I will be discussing the Hello World Server and Hello World Client. If you missed those blog posts it may be useful to go and view them here and here. There has been code drift since those posts, mainly the changes we will discuss in this post. The source code for the utility functions can be found in my protobuf-utilities code repo on github.

Code Discussion

This time we will omit the discussion of the asd files. We went through the asd files line-by-line in the two posts referenced in the notes so please look at those.

In addition to the main macros we discuss and show below, we use two helper functions deserialize-proto-from-base64-string and serialize-proto-to-base64-string which can be found in my protobuf-utilities repo.

Server-Side

We noticed that a large part of the problem with using cl-protobufs protocol buffer objects in an HTTP request and response is the tedium of translating the base64-encoded string sent to the server into a protocol buffer, and then reversing the process for the response object. We know which parameters to our HTTP handler will be either nil or a base64-encoded proto packed in a string, and we know their respective types. With this we can make a macro to translate the strings to their respective protos and use them in an enclosing lexical scope.

Why a macro? Many Lispers may not ask this question, but we should, as macros are harder to reason about than functions. We want the body of our macro to run in a scope where it has access to all of the deserialized protobuf messages. We are creating a utility that will work for any list of proto messages so long as we know their types. We could, with some effort, make a function that accepts a function and have the outer function funcall the inner one, but it would be ugly. With a macro we can create new syntax which will simplify the code, allowing us to simply list the protobuf messages we wish to deserialize and then use them.

Given that, what our macro should accept is obvious: a list of conses, each containing the variable that holds an encoded proto and the type of message to be encoded/decoded. We also take a body in which the supplied symbols will refer to deserialized protos.

(defmacro with-deserialized-protos 
  (message-message-type-list &body body)
  "Take a list (MESSAGE . PROTO-TYPE) 
MESSAGE-MESSAGE-TYPE-LIST where the message will be 
a symbol pointing to a base64-encoded serialized proto 
in a string. Deserialize the protos and store them in 
the message symbols. The messages are bound lexically 
so after this macro finishes the protos return to be 
serialized base64-encoded strings."
  `(let ,(loop for (message . message-type) 
            in  message-message-type-list
               collect
               `(,message 
                   (deserialize-proto-from-base64-string
                      ',message-type
                      (or ,message ""))))
     ,@body))

It is plausible that our HTTP server will respond with a base64-encoded protocol buffer object. We could first call `with-deserialized-protos` to do some processing, creating a new protocol buffer object, and then call a function like `serialize-proto-to-base64-string`. Instead I create a macro that automatically serializes and base64-encodes the result of a body.

(defmacro serialize-result (&body body)
  (let ((result-proto (gensym "RESULT-PROTO")))
    `(let ((,result-proto ,@body))
       (serialize-proto-to-base64-string ,result-proto))))

Since we’ve gone this far, we can string these two macros together:

(defmacro with-deserialized-protos-serializing-return 
  (message-message-type-list &body body)
  `(serialize-result (with-deserialized-protos 
                       ,message-message-type-list ,@body)))

This vastly improves our handler:

(define-easy-handler (hello-world :uri "/hello")
    ((request :parameter-type 'string))
  (pu:with-deserialized-protos-serializing-return 
     ((request . hwp:request))
    (hwp:make-response
     :response
     (if (hwp:request.has-name request)
         (format nil "Hello ~a" (hwp:request.name request))
         "Hello"))))

A final pro-macro argument: macros allow us to make syntax that describes what we want a region of code to accomplish. The macros I wrote aren’t strictly necessary; you could just call `deserialize-proto-from-base64-string` several times in a let binding. Since you probably only have one request proto, that would do fine. You could also serialize the return proto yourself. I find the macros make the code nicer to write; the downside is that people working on the code will have to know what these macros do. Thankfully, we have M-x and docstrings for that.

Client-Side

We have the reverse story on the client side. We start by having to serialize and base64-encode our proto objects before sending them over the wire, and then deserialize the result. One would imagine writing the same kind of macro here as we wrote on the server side. The problem with that is there’s no real body we want to run with the serialized protos we send over the wire, and we get only one proto back, so we can just deserialize the HTTP result and let-bind it. A plain function will do.

(defun proto-call 
    (call-name-proto-list return-type address)
  (let* ((call-name-serialized-proto-list
           (loop for (call-name .  proto) 
              in call-name-proto-list
                 for ser-proto 
               = (pu:serialize-proto-to-base64-string proto)
                 collect
                 (cons call-name ser-proto)))
         (call-result
           (or (drakma:http-request
                address
                :parameters call-name-serialized-proto-list)
               "")))
    (pu:deserialize-proto-from-base64-string return-type 
       call-result)))
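
A hedged usage sketch against the handler shown earlier; the host, port, and the make-request keyword argument are assumptions:

;; "request" matches the easy-handler parameter name; :name is assumed to be
;; the request proto's field keyword.
(proto-call (list (cons "request" (hwp:make-request :name "Lisper")))
            'hwp:response
            "http://localhost:4242/hello")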

Final Remarks

In this blog post we implemented several helper macros and a function for working with protocol-buffer objects in an HTTP environment. I believe the macros in protobuf-utilities are the missing link that will make cl-protobufs a welcome addition to Common Lisp HTTP servers.

Pull requests are always welcome.


I would like to thank @rongut, @cgay, and @benkuehnert for their edits and comments.

Over-Engineering FizzBuzz

The main language I use in my day-to-day programming life is Common Lisp. It’s a wonderful language with some very powerful tools that most other languages don’t have. How many other languages have the powerful macro system of Lisp? How about generic functions? Not many.


Side note: A generic function is a function that you can have many different versions of, which use a type system to determine which version should be called. This isn’t completely true, but it’s good enough for what I’m writing here.


With this much power we can write code more complex than it ever should be. Let’s use FizzBuzz as an example. The goal of FizzBuzz is to print the numbers from 1 to 100, where if the number is divisible by 3 we print “Fizz”, if it’s divisible by 5 we print “Buzz”, and if it’s divisible by both 3 and 5 we print “FizzBuzz”. It’s a classic interview problem and now an interview trope.

First, let’s do a simple macro example. In this example I don’t want recursion, multiple function calls, or loop iteration in the resulting code, so I make a macro that unrolls into a sequence of print statements.

(defmacro stupid-fizz-buzz (c)
  (cond ((> c 100) ())
        ((zerop (mod c 15))
          `(progn
             (print "FizzBuzz")
             (stupid-fizz-buzz ,(1+ c))))
        ((zerop (mod c 3))
          `(progn
             (print "Fizz")
             (stupid-fizz-buzz ,(1+ c))))
        ((zerop (mod c 5))
          `(progn
             (print "Buzz")
             (stupid-fizz-buzz ,(1+ c))))
        (t
          `(progn
             (print ,c)
             (stupid-fizz-buzz ,(1+ c))))))

Changing 100 to 3 and calling macroexpand-all on (stupid-fizz-buzz 1) we get:

(PROGN (PRINT 1) (PROGN (PRINT 2)
  (PROGN (PRINT "Fizz") NIL)))

There are nicer ways to write stupid-fizz-buzz as a macro, but this is a dead simple way.

Also, calling (let ((n 1)) (stupid-fizz-buzz n)) won’t work because n isn’t an integer at the time of macro expansion, so some care must be taken: for the macro to work, the input must be an integer at macro-expansion time. To fulfill the problem we could define the inlined function below, and we should see the unrolled code wherever we call fizz-buzz in our code after compilation.

(declaim (inline fizz-buzz))
(defun fizz-buzz ()
  (stupid-fizz-buzz 1))

Perhaps you believe one function should print “Fizz” and another function should print “Buzz”. Also, you love generic functions.

(defparameter *fizz* 3)
(defparameter *buzz* 5)
(defparameter *up-to* 100)

(defgeneric %stupid-fizz-buzz (count))

(defmethod %stupid-fizz-buzz :before ((count integer))
  (when (zerop (mod count *fizz*))
    (format t "Fizz")))

(defmethod %stupid-fizz-buzz :before ((count rational))
  (when (zerop (mod count *buzz*))
    (format t "Buzz")))

(defmethod %stupid-fizz-buzz (count)
  (if (or (zerop (mod count *fizz*))
          (zerop (mod count *buzz*)))
      (format t "~%")
      (format t "~a~%" count))
  (when (< count *up-to*)
    (%stupid-fizz-buzz (1+ count))))

(defun stupid-fizz-buzz ()
  (%stupid-fizz-buzz 1))

 

Here, integer is more specific than rational, so the "Fizz" will occur before the "Buzz". Either way, entirely over-engineered… At least the macro version has the benefit of complete loop unrolling.
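
We can check the specializer ordering directly:

;; INTEGER is a subtype of RATIONAL, so the INTEGER :before method is more
;; specific and runs first under standard method ordering.
(subtypep 'integer 'rational) ; => T, T
(subtypep 'rational 'integer) ; => NIL, T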


How should FizzBuzz actually be done?

(defun fizz-buzz ()
  (loop for i from 1 to 100 do
    (when (zerop (mod i 3))
      (format t "Fizz"))
    (when (zerop (mod i 5))
      (format t "Buzz"))
    (if (or (zerop (mod i 3))
            (zerop (mod i 5)))
        (format t "~%")
        (format t "~a~%" i))))

Is probably what it should be; you can do a bit better if you understand the format directives.


I hope you had fun in this silly post.

Shout out to @cgay for catching some spelling errors and for the note about the macro not working for (let ((n 1)) (stupid-fizz-buzz n)).

There are more examples here: https://www.reddit.com/r/lisp/comments/59ikqm/the_most_elegant_implementation_of_fizzbuzz/

Finally, Little one:

[Photo]

Working From Home, 2020

Welcome to an update on working from home, 2020 Covid edition. For those who are new to this blog, or who forgot, I did a post about working from not-at-work a long while ago: Working From Away. With Covid-19, most programmers are working from home, including me, so I thought it was a good time to discuss working from home again.

[Photo: And teaching from home!]

The major difference this time is that you shouldn’t travel. In my previous post I said I liked working outside the house, usually in a coffee shop. During a pandemic, that’s probably a really bad idea. I would suggest supporting your favorite coffee shop (you probably want to drink there again after the pandemic ends), but you should really choose to stay home.

I’m going to assume you work for a company and you’re not a self-employed contractor. If you are a self-employed contractor, you probably already work from home. Good for you!

As a programmer, you probably have a laptop that was issued by your company. This is probably the machine you will be working with. Depending on the company, you may ssh into a physical machine located inside of your company or you may do your work directly on that machine. I don’t find the difference to matter, though I do all of my programming on Emacs in terminal mode.


1. You may choose to work solely with your laptop.

I personally don’t like this scenario. The screen is too small; there isn’t enough screen real estate to view all of my code.

2. You may have a monitor.

I have a 27″ monitor. I find it’s a good size: I can easily have two or three side-by-side Emacs buffers up. I also use my laptop screen for web browsing.

3. You may have more than one extra monitor.

I find two screens to be optimal, one for code and one for non-code. You may like three. I hear Bill Gates prefers his three monitor setup.


Now, where to work? Some people like to sit on a couch. This is terrible for your back; please don’t do this too much.

My wife likes to sit at our dining table. This is fine for small bouts of work, but I don’t want to work at my dining table; I prefer to eat there.

I recently purchased a desk from IKEA: https://www.ikea.com/us/en/p/micke-desk-black-brown-10244743/ . If I had more space I would have ordered a larger desk, but it barely fits!

[Photo: My daughter trying to build my desk.]

[Photo: Really is the perfect size.]

Make sure you have a decent chair.

With a little one, you should also be able to wall yourself away. If I wasn’t in a separate room, little one would never let me get work done! If you don’t have a little one, I would still suggest having a separate room. It gives your life distance from your work.


One last note: Please stay at home. Don’t go to your favorite coffee shop! Feel free to support them though. Help support your local hospitals.

Please read:

https://medium.com/@tomaspueyo/coronavirus-act-today-or-people-will-die-f4d3d9cd99ca


Don’t forget to have fun.

[Photo: And eat bacon, lots of bacon.]