Kids Agency

Agency, noun; action or intervention, especially such as to produce a particular effect.

I’m a big believer that kids should be given as much agency as possible, given their age and ability. There is a movement called Free Range Kids fighting back against the continual coddling and oversight of kids. We should be very supportive of this. In this post I’ll describe some of what I think should be allowed.

Bikes

Being a kid is hard, you are continually being told what you can and cannot do, you can’t transport yourself anywhere, and you live in an adult world. The first taste of freedom you are likely to get is on a bike. You suddenly go from a walking or running speed, to a much faster biking speed. Your parents can no longer feasibly keep up with you, and you have the ability to explore your world.

When I was a kid my bike was how I got around. Living in rural Vermont, the only way to get anywhere was on a bike (or a car if you’re old enough). Getting to friends was nearly impossible on foot. My bike let me get to my friends, the local park, my school, and the town’s convenience store. We’ll discuss more about letting kids run around alone later.

Now that I have kids, I want them to learn how to ride. Not only is it good exercise, and a lot of fun, but it will also serve as their first vehicle to get around town and see their friends. Right now I want them to be in my eyesight, but Lyra already loves riding her bike to the playground. Faye now has a balance bike but it will be a few months before she grasps riding.

Favorite kids bikes:
Woom 1: https://us.woombikes.com/products/1
Woom 2: https://us.woombikes.com/products/2

Playing Without Parents

When kids are old enough, they need their own space to be themselves. They need to be able to order their own lives, to run by themselves, and just be themselves. With parents constantly over their shoulder, they will never be able to learn about themselves.

When I was a kid, I lived in rural Vermont with many acres of land around me. My parents always said “There’s a large forest out back, go play.” Around 7 my friends and I would ride around Huntington, there was a playground near the gravel pit, and a convenience store a few miles away. We would ride down and get root beer, candy, or some other snack. So long as I told my parents where I was going, they were fine.

Now I live in a suburb of Boston MA, and everyone is scared of everyone else. People seem to not want kids to be playing by themselves. But I ask: “Whats the point of having a backyard if I can’t tell her to go out back and play?” She’s old enough to know to stay out back. In a few years (2, 3?) she should have no problem with the 5 minute walk down the street to the park.

Note: The street I lived in in Huntington had cars, we were smart enough to get out of the way.

Being Alone

Sometimes Lyra decides she wants to play alone. Maybe Faye is getting in her space, maybe Mama and Dada are being too belligerent, but she needs her own time. She goes into her room, or the playroom, and cleans, or plays, or reads. This is important for development, allowing her to self calm, self direct, and just be herself. Parents need to give this to kids.

Outro

Kids are near infinitely capable. The bounds that they have, are often the bounds we set upon them. Before thinking about what you’re comfortable with, what fears you have, think about what your child needs, what they can handle, and what kind of agency you want them to have. All this said, you should do what you feel is right for your kids!

Evil Lies about Hash Tables

Greetings readers. There are lies being told in Computer Science, lies you probably believe. Today I want to discuss the lie that is the constant time lookup or insertion of our friend the hash table. Don’t get me wrong, I love hash tables, some of my best functions use them, but they are far from constant time.

So, what the usual statement is: Hash tables have constant time lookups.

This is a lie. They have average case constant time lookups, worst case linear time lookups, and O is about worst case analysis. 

Let’s dig into this statement.

Faye trying to understand HashTables.

How do hash tables work?

Hash tables are key-value pairs, you tell them to store as keys some object type (for example strings), and as their value some other objects (for example a Person object).

This way we can have:

(let ((persons (make-hash-table :test #'equalp)))
  (setf (gethash “Lyra” persons) (list :name "Lyra" :height 36 :weight 34))
  (setf (gethash “Lrya” persons) (list :name "Lrua" :height 34:weight 36))
  (setf (gethash “Faye” persons) (list :name "Faye" :height 24 :weight 20))
  (print (gethash “Lyra” persons))
  …)

So `persons` is a hash table containing a list of person attributes.

Their power is in hashing, the usage of a function that maps an object to an integer. Take strings as an example, a simple hash function could be

(defun hash-string (str)
  (reduce #'+ str :key #'char-int))

Which adds up the integer values of each character, where "a” is 97, “b” is 98…

When we run

 (gethash “Lyra” persons)

First we hash the string “Lyra” to get 408. The constant time lookup type we know of is arrays, so as one would expect hash tables could (and should) be backed by arrays. 

This leads to two problems:

  1. The possible values of our hash function are bigints, a hash table can’t be backed by an array of arbitrary size.
  2. What if two keys have the same value?

The first is simple, we limit the hash table to a certain size, so it has m possible values. Then we just take (mod (string-hash “foo”) m) to get the array index.

The second is also fairly simple, we create a list of elements. So “Ly” and “yL” would be in the same array index, or in hash-table parlance bucket. A value in this bucket will be the key-value pair, so in the bucket containing “Lyra” in persons we might see

(list (“Lyra” .  (list :name "Lyra" :height 36 :weight 34))
       (“Lrya” , (list :name "Lrya" :height 34 :weight 36)))

Then we are left with testing the key portion of our hash-table with our test function versus the given key, and finally returning the value.

Side note: If you have a way to compare the elements in your bucket you can use a balanced binary tree instead of a list. This takes the worst case lookup time from O(k) where k is the number of elements in your bucket, to O(log k). Java 8 did this for large bucket sizes in it’s HashMap (at least some implementations).

Why is this not just O(n) or O(log n)?

Well it is, if your hash function is constant (or bad) then you will get lots of hash-collisions, and turn your nice shiny hash-table into a balanced binary tree or a list. But we really care about average lookup times.

If we fix our bucket size m, then hash-tables are back to bad… The good news is we can resize the buckets. This requires us to rehash all of the elements, and make new lists. This could be expensive, but it probably occurs rarely, and proper values for the size of our hash-table can greatly improve things. You can find the math in any CS Data Structures book, or wikipedia:

but with proper rehashing and a good hash function you’re back down to constant time lookups!

So please, don’t tell me your maps are O(1) without at least alluding to the fact that this is not quite O…

Other lie:

Quicksort is not O(n log n).

it’s actually O(n^2), but it almost always outperforms it’s O(n log n) rivals.

Out With 2021, In With 2022

Greetings readers,

I know I don’t post much, but I have many ideas. Maybe someday I’ll find the time to write them!

The year 2021 is practically over, and I feel like a year end blog post is in order. It was quite a year. I say that every year, so maybe every year is quite a year!

The most important part of the year was the birth of little Faye. It’s interesting having your second child, in some ways you’re much more prepared for the work, for the sleeplessness, and for taking care of a baby. In other ways, you’re much less prepared than you were for the first one. You know the best ways to get them to sleep, but then the older sister goes and wakes them up!

This year saw Lyra starting preschool, spending time out of the house without us. For the first time Grandpa Don took care of her alone. For the first time I saw her interact with a very close best friend. In some ways I know she’s still my little girl, only 3, but in so many other ways she seems to be getting too big already.

Last year ended with us buying a house, this year we didn’t get to move in until nearly the end of the year. Even after that, the place was barely liveable until the end of October. It’s still great for Lyra to have a large yard to play in, and big bad wolves (not really) to hunt out back.

Wenwen keeps working at UMass Boston. The hardest thing is finding time to relax, someday we’ll find some.

As for 2022. I don’t imagine the pandemic going away, it will linger on and we’ll be forced to find better ways to live in a world with a pandemic, but we’ll find ways. We have our family. Despite all thats going on, I don’t think there’s been a better time for me.

Jon

Playgrounds

Sorry for the absolute lack of updates, between 2 kids, a home renovation, and maybe some math, there is no time to write technical posts. I am excited to say that a gRPC client for Common Lisp exists, but it currently lacks an ASDF file so thats not completely ready. This is not a Lisp post however.

Today I brought Lyra and Faye to the park. Well, I do that everyday, but I did it today as well. My older daughter (now 3) likes to play with slightly older girls, and she ran off to play. I sat by with Faye, and watched her. She went up to one of the girls and tried to play, but then came back. She asked me if I could ask if she could play with them, so I told her just ask “Can I play with you.” Next thing I saw was them pushing her away and saying “We don’t want to play with a baby.”

I’m not entirely sure what to think. On one side, she has to learn how to navigate a playground. I don’t know what I could do, but be there for her when she came over.

On the plus side her nickname is now Owlet. I said “Goodnight Lyra” and she said “No dada, say goodnight Owlet!” So I said “Goodnight Owlet” and she said “Goodnight dada” with the saddest little Owlet voice.

Cl-Protobufs Enumerations

In the last few posts we discussed family life, and before that we created a toy application using cl-protobufs and the ACE lisp libraries. Today we will dive deeper into the cl-protobufs library by looking at Enumerations. We will first discuss enumerations in Protocol Buffers, then we will discuss Lisp Protocol Buffer enums.

Enums:

Most modern languages have a concept of enums. In C++ enumerations are compiled down to integers and you are free to use integer equality. For example

enum Fish {
 salmon,
 trout,
}

void main {
  std::cout << salmon == 0 << std::endl;
}

Will print true. This is in many ways wonderful: enums compile down to integers and there’s no cost to using them. It is baked into the language! 

Protocol Buffers are available for many languages, not just C++. You can find the documentation for Protocol Buffer enums here: 

https://developers.google.com/protocol-buffers/docs/proto#enum

Each language has its own way to support enumeration types. Languages like C++ and Java, which have built-in support for enumeration types, can treat protobuf enums like any other enum. The above enum could be written (with some caveats) in Protocol Buffer as:

enum Fish {
  salmon = 0;
  trout = 1;
}

You should be careful though, Protoc will give a compile warning that enum 0 should be a default value, so 

enum Fish {
  default = 0;
  salmon = 1;
  trout = 2;
}

Is preferred.

Let’s get into some detail for the two variants of Protocol Buffers in use.

// Example message to use below.
enum Fish {
  default = 0;
  salmon = 1;
  trout = 2;
}

message Meal {
  {optional} Fish fish;
}

The `optional` label will only be written for proto 2.

Proto 2:

In proto 2 we can always tell whether `Meal.fish` was set. If the field has the `required` label then it must be set, by definition. (But the `required` label is considered harmful; don’t use it.) If the field has an `optional` label then we can check if it has been set or not, so again a default value isn’t necessary.

If the enum is updated to:

// Example message to use below.
enum Fish {
  default = 0;
  salmon = 1;
  trout = 2;
  tilapia = 3;
}

and someone sends fish = tilapia to a system where tilapia isn’t a valid entry, the library is allowed to do whatever it wants! In Java it sets it to the first entry, so Meal.fish would be default! 

Proto 3

In proto3 if the value of Meal.fish is not set, calling its accessor will return the default value which is always the zero value. There is no way to check whether the field was explicitly set. A default value (i.e., a name that maps to the value zero) must always be given, else the user will get a compile error.

If the Fish enum was updated to contain tilapia as above, and someone sent a proto message containing tilapia to a system with an older program that had the message not containing tilapia, the deserializer should save the enum value. That is, the underlying data structure should know it received a “3” for the fish field in Meal. How the accessors return this value is language dependent. Re-serializing the message should preserve this “unrecognized” value.

A common example is: A gateway system wants to do something with the message and then forward it to another system. Even though the middle system has an older schema for the Fish message it needs to forward all the data to the downstream system.

Cl-protobufs:

Now that we understand the basics of enumerations, it is important to understand how cl-protobufs records enumeration values

Lisp as a language does not have a concept of enumerations; what it does understand is keywords. Taking fish as above and running protoc we will get (see readme https://github.com/qitab/cl-protobufs/#enums):

(deftype fish ‘(:default :salmon :trout))

(defun fish-to-int (keyword) 
  (ecase keyword
    (:default 0)
    (:salmon 1)
    (:trout 2)))

(defun int-to-fish (int)
  (ecase int
    (0 :default)
    (1 :salmon)
    (2 :trout)))

Looking at the tilapia example, the enum deserializer preserves the unknown field in both proto2 and proto3. Calling an accessor on a field containing an unknown value will return :%undefined-n. So for tilapia we will see :%undefined-3.

Warning: To get this to work properly we have to remove type checks from protocol buffer enumerations. You can set the field value in a lisp protocol buffer message to any keyword you want, but you will get a serialization error when you try to serialize. This was a long discussion internally, but that design discussion could turn into a blog post of its own.

Conclusion:

The enumeration fields in cl-protobufs are fully proto2 and proto3 compliant. To do this we had to remove type checking. As a consumer, it is suggested that you always type check and handle undefined enumeration values in your usage of protocol buffer enums. We give you a deftype to easily check.

I hope you have enjoyed this deep dive into cl-protobuf enums. We strive to remove as many gotchas as possible.


Thanks to Ron and Carl for the continual copy edits and improvements!

Week n With Two Baby

The last few weeks has been about 2 little kids, and a very tired Wenwen. In the last post I discussed how Lyra felt sad no longer being the only child. I thought I’d give an update on how life was going. Again, there is no programming in this post, I haven’t done any programming in the last few weeks, and only a smidgeon of math.

Lyra is feeling much better about being an older sister. She watches for her sister to cry and will run up to me and say “sisters crying”. She has even helped changed a diaper on occasion. She will come up and hug Faye. 

That being said, she does miss having more one-on-one time. When Wenwen finishes teaching, she has to feed Faye. Lyra will come up to me and say “Dada, hold sister”. When Wenwen brings her to bed she will say “Dada tired” so I will join them (despite having to hold Faye instead). She is an amazing sister!

Faye is still in the Eat, Sleep, Poop phase of baby-dom. She has gas, cries, and does all of the normal baby stuff. She still seems like a very quiet newborn, but as everyone tells me, she’s her own self. 

Two weekends ago my mom and aunt Berta came to help us move out of our old condo. We will (someday?) be moving into our new house, but the court is taking forever to okay the sale. Beware probate!

Finally, a giant thanks to my (and Wenwen’s) advisor: Don Hadwin. He came over and stood out the window to talk to Lyra and Faye. He is also letting us stay with him as we attempt to buy a new house. I’m continually indebted and thankful to you. I hope we can do some math soon!

Welcoming Faye, Helping Lyra

Greeting everyone.

This will be the first post since Faye’s birth. I have one more Proto Cache post coming, in fact the code is already up on Github at head, but I don’t have time to write it yet. Technical posts take quite a while to write, then get copy edits and code fixes. It will come, someday…

Faye is doing great. She’s much quieter than Lyra was at her age. As a parent, you tend to worry about everything, but that’s just life as a parent. You truly forget how small newborns are.

I get worried about Lyra. When Wenjing was at the hospital Lyra kept wondering when Mama was going to come home. She said she was excited to see Faye, but she had no idea the change that would happen when we got to the hospital. That first meeting was hard.

Due to Covid, I was able to enter the hospital once, and since I had a Lyra at home I was only there for around 5 hours. When Lyra and I got to the hospital, I carried Lyra inside as she was spooked to be at the hospital. When we got to Faye, I held Faye, put her in the car seat, and sang a quick lullabye. Lyra’s face started scrunching up, and tears quickly followed. It’s tragic seeing a toddler about to cry, knowing what’s coming, but having no way to stop the tears.

Lyra’s gotten over the immediate shock. She helps us change diapers, and talks to Faye. She seems happy to be an older sister. On the other hand, she gets jealous. Wenjing is allowed to be with Faye, Lyra is fine with that. But Lyra seems to think Dada is hers

Lyra never asked for this. Her small world has changed immensely, and unlike adults she has no experience handling change. Toddlers have a hard time controlling their emotions, everything is shown on their sleeve. Even with this, she already loves her little sister. 

Proto Cache: Flags and Hooks

Today’s Updates

Last week we made our Pub/Sub application use protocol buffer objects for most of its internal state. This week we’ll take advantage of that change by setting startup and shutdown hooks to load state and save state respectively. We will add flags so someone starting up our application can set the load and save files on the command line. We will then package our application into an executable with a new asdf command.

Code Changes

Proto-cache.lisp

Defpackage Updates:

We will use ace.core.hook to implement our load and exit hooks. We will show how to make methods that will run at load and exit time when we use this library in the code below. In the defpackage we use the nickname hook. The library is available in the ace.core repository.

We use ace.flag as our command line flag parsing library. This is a command line flag library used extensively at Google for our lisp executables. The library can be found in the ace.flag repository.

Flag definitions:

We define three command line flags:

  • flag::*load-file*
  • flag::*save-file*
  • flag::*new-subscriber* 
    • This flag is used for testing purposes. It should be removed in the future.
  • flag::*help*

The definitions all look the same, we will look at flag::*load-file* as an example:

(flag:define flag::*load-file* ""
  "Specifies the file from which to load the PROTO-CACHE on start up."
  :type string)
  • We use the flag:define macro to define a flag. Please see the code for complete documentation of this macro (REAME.md update coming). We only use a small subset of the ace.flag package.
  • flag::*load-file*: This is the global where the parsed command line flag will be stored.
  • The documentation string to document the flag. If flag:print-help is called this documentation will be printed:

    --load-file (Determines the file to load PROTO-CACHE from on startup)

     Type: STRING

  • :type : The type of the flag. Here we have a string.

We use the symbol-name string of the global in lowercase as the command line input. 

For example:

  1. flag::*load-file* becomes --load-file
  2. flag::*load_file* becomes –load_file

The :name or :names key in the flag:define macro will let users select their own names for the command line input instead of this default.

Main definition:

We want to create a binary for our application. Since we have no way to add publishers and subscribers outside of the repl we define a dummy main that adds publishers and subscribers for us:

(defun main ()
  (register-publisher "pika" "chu")
  (register-subscriber "pika" flag::*new-subscriber*)
  (update-publisher-any
    "pika" "chu"
    (google:make-any :type-url "a"))
  ;; Sleep to make sure running threads exit.
  (sleep 2))

After running the application we can check for a new subscriber URL in the saved proto-cache application state file. I will show this shortly.

Load/Exit hooks:

We have several pre-made hooks defined in ace.core.hook. Two useful functions are ace.core.hook:at-restart and ace.core.hook:at-exit. As one can imagine, at-restart runs when the lisp image starts up, and at-exit runs when the lisp image is about to exit.

The first thing we do when we start our application is parse our command line:

(defmethod hook::at-restart parse-command-line ()
  "Parse the command line flags."
  (flag:parse-command-line)
  (when flag::*help*
    (flag:print-help)))

You MUST call flag:parse-command-line for the defined command line flags to have non default values.

We also print a help menu  if --help was passed in.

Then we can load our proto if the load-file flag was passed in:

(defmethod hook::at-restart load-proto-cache :after parse-command-line  ()
  "Load the command line specified file at startup."
  (when (string/= flag::*load-file* "")
    (load-state-from-file :filename flag::*load-file*)))                                                                                            

We see an :after clause in our defmethod. We want the load-proto-cache method called during start-up but after we have parsed the command line so flag::*load-file* has been properly set. 

Note: The defmethod here uses a special defmethod syntax added in ace.core.hook. Please see the hook-method documentation for complete details.

Finally we save our image state at exit:

(defmethod hook::at-exit save-proto-cache ()
  "Save the command line specified file at exit."
  (when (string/= flag::*save-file* "")
    (save-state-to-file :filename flag::*save-file*)))

The attentive reader will notice our main function never explicitly called any of these hook functions…

Proto-cache.asd:

We add code to build an executable using asdf:

(defpackage :proto-cache …
  :build-operation "program-op"
  :build-pathname "proto-cache"
  :entry-point "proto-cache:main")

This is a program-op. The executable pathname is relative, we save the binary as “proto-cache” in the same directory as our proto-cache code. The entry point function is proto-cache:main.

We may then call: 

sbcl --eval "(asdf:operate :build-op :proto-cache)" 

at the command line to create our binary.

Running our binary:

With our binary built we can call:

./proto-cache  --save-file /tmp/first.proto --new-subscriber http://www.google.com

Trying cat /tmp/first.pb:

pika'
http://www.google.com
a?pika"chujg

These are serialized values so one shouldn’t try to understand the output so much. We can see “http://www.google.com”, “pika”, and “chu” are all saved.

Calling

./proto-cache   --load-file /tmp/first.pb --save-file /tmp/first.pb --new-subscriber http://www.altavista.com

And then cat /tmp/first.pb:

I
pikaA
?http://www.altavista.com
http://www.google.com
a?pika"chujg
“

Finally calling  ./proto-cache  --help

We get:

Flags from ace.flag:

    --lisp-global-flags
     (When provided, allows specifying global and special variables as a flag on the command line.
       The values are NIL - for none, :external - for package external, and T - for all flags.)
     Type: ACE.FLAG::GLOBAL-FLAGS

    --help (Whether to print help) Type: BOOLEAN Value: T

    --load-file (Determines the file to load PROTO-CACHE from on startup)
     Type: STRING
     Value: ""

    --new-subscriber (URL for a new subscriber, just for testing)
     Type: STRING
     Value: ""

    --lisp-normalize-flags
     (When non-nil the parsed flags will be transformed into a normalized form.
       The normalized form contains hyphens in place of underscores, trims '*' characters,
       and puts the name into lower case for flags names longer than one character.)
     Type: BOOLEAN

    --save-file (Determines the file to save PROTO-CACHE from on shutdown)
     Type: STRING
     Value: ""

This shows our provided documentation of the command line flags as expected.

Conclusions:

Today we added command line flags, load and exit hooks, and made our application buildable as an executable. We can build our executable and distribute it as we see fit. We can direct it to load and save the application state to user specified files without updating the code. There is still much to do before it’s done but this is slowly becoming a usable application.

There are a few additions I would like to make, but I have a second child coming soon. This may (or may not) be my last technical blog post for quite some time. I hope this sequence of Proto Cache posts has been useful thus far, and I hope to have more in the future.

Thanks to Ron Gut and Carl Gay for copious edits and comments.

Proto Cache: Saving State

Todays Updates:

In our last post we implemented a basic Pub Sub application that stores an Any protocol buffer message and a list of subscribers. When the Any protocol buffer message gets updated we send the new Any message in the body of an http request to all of the subscribers in the subscribe-list. 

Today we will update our service to save all of the state in a protocol buffer message. We will also add functionality to save and load the state of the Proto Cache application. 

Note: Viewing the previous post is highly suggested!

Code Updates:

Note: We use red to denote removed code and green to denote added code.

pub-sub-details.proto

`syntax = proto3`

We will use proto3 syntax. I’ve yet to find a great reason to choose proto3 over proto2, but I’ve also yet to find a great reason to choose proto2 over proto3. The biggest reason to choose proto3 over proto2 is that most people use proto3, but the Any proto will store proto2 or proto3 messages regardless.

import “any.proto”

Our users are publishing Any messages to their clients, so we must store them in our application state. This requires us to include the any.proto file in our proto file.

message PubSubDetails

This contains (almost) all of the state needed for the publish subscribe service for one user:

  • repeated string subscriber_list
  • google.protobuf.Any current_message
    • This is the latest Any message that the publisher has stored in the Proto Cache.
  • string username
  • string password
    • For any kind of production use this should be salted and hashed. 

message PubSubDetailsCache

This message contains one entry, a map from a string (which will be a username for a publisher) to a PubSubDetails instance. The attentive reader will notice that we save the username twice, once in the PubSubDetails message and once in the PubSubDetailsCache map as the key. This will be explained when we discuss changes to the proto-cache.lisp file.

proto-cache.asd

The only difference in proto-cache.asd from all of the other asd files we’ve seen using protocol buffers is the use of a protocol buffer message in a package different from our current package. That is, any.proto resides in the cl-protobufs package but we are including it in the pub-sub-details.proto file in proto-cache.

To allow the protoc compiler to find the any.proto file we give it a :proto-search-path containing the path to the any.proto file. 


...
    :components
    ((:protobuf-source-file "pub-sub-details"
      :proto-pathname "pub-sub-details.proto"
      :proto-search-path ("../cl-protobufs/google/protobuf/"))
...

Note: We use a relative path: “../cl-protobufs/google/protobuf/”, which may not work for you. Please adjust to reflect your set-up.

We don’t need a component in our defsystem to load the any.proto file into our lisp image since it’s already loaded by cl-protobufs. We might want to just to recognize the direct dependency of the any.proto file. 

proto-cache.lisp

Defpackage updates:

We are adding new user invokable functionality so we export:

  • save-state-to-file
  • load-state-from-file

local-nicknames:

  • cl-protobufs.pub-sub-details as psd
    • This is merely to save typing. The cl-protobufs.pub-sub-details is the package that contains the functionality derived from pub-sub-details.proto.

Globals:

*cache*: This will be a protocol buffer message containing a hash table with string keys and pub-sub-details messages. 

(defvar *cache* (make-hash-table :test 'equal))
(defvar *cache* (psd:make-pub-sub-details-cache))

*mutex-for-pub-sub-details*: Protocol buffer messages can’t store lisp mutexes. Instead, we store the mutex for a pub-sub-details in a new hash-table with string (username) keys.

make-pub-sub-details:

This function makes a psd:pub-sub-details protocol buffer message. It’s almost the same as the previous iteration of pub-sub-details except for the addition of username.


...
  (make-instance 'pub-sub-details :password password))
  (psd:make-pub-sub-details :username username
                            :password password
                            :current-any (google:make-any))
...

(defmethod (setf psd:current-any) (new-value (psd psd:pub-sub-details))

This is really a family of functions:

  • :around: When someone tries to set the current-message value on a pub-sub-details struct we want to write-protect the pub-sub-details entry. We use an around method which activates before any call to the psd:current-any setter. Here we take the username from the pub-sub-details message and write-hold the corresponding mutex in the *mutex-for-pub-sub-details* global hash-table. Then we call call-next-method which will call the main (setf current-any) method.
(defmethod (setf current-any) (new-value (psd pub-sub-details))
(defmethod (setf psd:current-any) :around (new-value (psd psd:pub-sub-details))
  • (setf psd:current-any): This is the actual defmethod defined in cl-protobufs.pub-sub-details. It sets the current-messaeg slot on the message struct.
  • :after: This occurs after the current-any setter was called. We send an http call to all of the subscribers on the pub-sub-details subscriber list. Minus the addition of the psd package prefix to accessor functions of pub-sub-details this function wasn’t changed.

 register-publisher:

The main differences between the last iteration of proto-cache and this one are:

  1. This *-gethash method is exported by cl-protobufs.pub-sub-details so the user can call gethash on the hash-table in a map field of a protocol buffer message.
    • (gethash username *cache*)
    • (psd:pub-sub-cache-gethash username *cache*)
  2. We add a mutex to the *mutex-for-pub-sub-details* hash-table with the key being the username string sent to register-publisher.
  3. We return t if the new user was registered successfully, nil otherwise.

register-subscriber and update-publisher-any:

  1. The main difference here is:
    1. (gethash publisher *cache*)
    2. (psd:pub-sub-cache-gethash publisher *cache*)
  2. We have to use the psd package prefix to all of the accessors to pub-sub-details

save-state-to-file:

(defun save-state-to-file (&key (filename "/tmp/proto-cache.txt"))
  "Save the current state of the proto cache to *cache* global
   to FILENAME as a serialized protocol buffer message."
  (act:with-frmutex-read (*cache-mutex*)
    (with-open-file (stream filename :direction :output
                                     :element-type '(unsigned-byte 8))
      (cl-protobufs:serialize-to-stream stream *cache*))))

This is a function that accepts a filename as a string, opens the file for output, and calls cl-protobufs:serialize-to-stream. This is all we need to do to save the state of our applications!

load-state-from-file:

We need to do three things:

  1. Open a file for reading and deserialize the Proto Cache state saved by save-sate-to-file
  2. Create a new map containing the mutexes for each username.
  3. Set the new state into the *cache* global and the new mutex hash-table in *mutex-for-pub-sub-details*.
    1. We do write-hold the *cache-mutex* but I would suggest only loading the saved state when Proto Cache is started.
(defun load-state-from-file (&key (filename "/tmp/proto-cache.txt"))                                                                                   
  "Load the saved *cache* globals from FILENAME. Also creates                                                                                          
   all of the fr-mutexes that should be in *mutex-for-pub-sub-details*."
  (let ((new-cache
          (with-open-file (stream filename :element-type '(unsigned-byte 8))
            (cl-protobufs:deserialize-from-stream
              'psd:pub-sub-details-cache :stream stream)))
        (new-mutex-for-pub-sub-details (make-hash-table :test 'equal)))
    (loop for key being the hash-keys of (psd:pub-sub-cache new-cache)
          do
             (setf (gethash key new-mutex-for-pub-sub-details)
                   (act:make-frmutex)))
    (act:with-frmutex-write (*cache-mutex*)
      (setf *mutex-for-pub-sub-details* new-mutex-for-pub-sub-details
            *cache* new-cache))))

Conclusion:

The main update we made today was defining pub-sub-details in a .proto file instead of a Common Lisp defclass form. The biggest downside is the requirement to save the pub-sub-details mutex in a separate hash-table. For this cost, we:

  1. Gained the ability to save our application state with one call to cl-protobufs:serialize-to-stream.
  2. Gained the ability to load our application with little more then one call to cl-protobufs:deserialize-from-stream.

We were also able to utilize the setf methods defined in cl-protobufs to create :around and :after methods.

Note: Nearly all services will be amenable to storing their state in protocol buffer messages.

I hope the reader has gained some insight into how they can use cl-protobufs in their application even if their application doesn’t make http-requests. Being able to save the state of a running program and load it for later use is very important in most applications, and protocol buffers make this task simple.

Thank you for reading!

Thanks to Ron, Carl, and Ben for edits!

Proto Cache: Implementing Basic Pub Sub

Today’s Updates

In our last post we saw some of the features of the ace.core.defun and ace.core.thread libraries by creating a thread-safe cache of the Any protocol buffer object. Today we are going to update the proto-cache repository to implement publisher/subscriber features. This will allow a publisher to publish a feed of Any messages and a subscriber to subscribe to such a  feed. 

It is expected (but not required) that the reader has read the previous post Proto Cache: A Caching Story. That post details some of the functions and objects you will see in today’s code.

Note: This is a basic implementation, not one ready for production use. This will serve as our working project going forward.

Code Updates

Proto-cache.asd

We want subscribers to be able to get new versions of an Any protocol buffer message. On the web, the usual way to receive messages is over HTTP. We use the Drakma HTTP client. You can see we added :drakma to the depends-on list in the defsystem.

Proto-cache.lisp

There are three major regions to this code. The first region is the global objects that make up the cache. The second is the definition of a new class, pub-sub-details. Finally the actual publisher-subscriber functions are at the bottom of the page.

Global objects:

The global objects section looks much like it did in our previous post. We update the *cache* hash-table to use equal as its test function and we are going to make the keys to this cache be username strings.

Pub-sub-details class:

The global objects section looks much like it did in our previous post. We update the *cache* hash-table to use equal as its test function and we are going to make the keys to this cache be username strings.

The pub-sub-details class contains the data we need to keep track of the publisher and subscriber features:

  • subscriber-list: This will be a list of the HTTP endpoints to send the Any messages to after the Any message is updated. Currently, we only allow for an HTTP message string. Future implementations should allow for security functionality on those endpoints.
  • current-any: The current Any message that the publisher has supplied.
  • mutex: A fr-mutex to protect the current-any slot. This should be read-held to get the current-any and it should be read-held to set a new current-any message.
  • password: The password for the subscriber held as a string. 

We shouldn’t be saving the password being as a string in the pub-sub-details class. At a minimum we should be salting and hashing this value. In the future we should implement an account system for readers and subscribers giving access to reading and updating the pub-sub-details. As this is only instructional and not production-ready code, I feel okay leaving it as is for the moment.

We create a make-pub-sub-details function that will create a pub-sub-details object with a given password. The register function doesn’t allow the user to set an Any message at creation time, and none of the other slots are useful to the publisher.

We create an accessor method to set the any-message value slot. We also create an :after method to send the Any message to any listening subscribers by iterating through the subscriber list and calling a drakma:http-request. We wrap this in unwind-protect so an IO failure doesn’t stop other subscribers from getting the message.

Finally we add a setter function for the subscriber list.

Function definitions:

Register-publisher:

This function is the registration point for a new publisher. It is almost the same as set-in-cache from our previous post except it checks that an entry in the cache for the soon-to-be-registered publisher doesn’t already exist. It would be bad to let a new publisher overwrite an existing publisher.

Register-subscriber:

Here we use a new macro, ace.core.etc:clet from the etc package in ace.core.

(defun register-subscriber (publisher address)
  "Register a new subscriber to a publisher."
  (ace:clet ((ps-struct
               (act:with-frmutex-read (*cache-mutex*)
                 (gethash publisher *cache*)))
             (ps-mutex (mutex ps-struct)))
    (act:with-frmutex-write (ps-mutex)
      (push address (subscriber-list ps-struct)))))

In the code below we search the cache for a user entry, if the entry is found then ps-struct will be non-nil and we can evaluate the body adding the subscriber to the list. If the subscriber is not found we return nil.

Update-publisher-any:

(defun update-publisher-any (username password any)
  "Updates the google:any message for a publisher
   with a specified username and password.
   The actual subscriber calls happen in a separate thread
   but 'T is returned to the user to indicate the any
   was truly updated."
  (ace:clet ((ps-class
              (act:with-frmutex-read (*cache-mutex*)
                (gethash username *cache*)))
             (correct-password (string= (password ps-class)
                                        password)))
    (declare (ignore correct-password))
    (act:make-thread
     (lambda (ps-class)
       (setf (current-any ps-class) any))
     :arguments (list ps-class))
    t))

In the update-publisher-any code we use clet to verify that the publisher exists and the password is found. We ignore the correct-password entry though.

We don’t want the publisher to be thread-blocked while we send the new message to all of the subscribers so we update the current-any in a separate thread. To do this we use the ace.core.thread function make-thread. A keen reader will see for SBCL this calls the sbcl make-thread function, otherwise it calls bordeaux-threads make-thread function.

If we are able to find a publisher with the correct password we return T to show success.

Conclusion

In today’s post we have made a basic publisher-subscriber library that will send an Any protocol buffer message to a list of subscribers. We have detailed some new functions that we used in ace.core. We have also listed some of the problems with this library. The code has evolved substantially from the previous post but it still has a long way to go before being production-ready.

Thank you for your reading!


Ron Gut, Carl Gay, and Ben Kuehnert gave comments and edits to this post.