Treechat

A full stack, peer-to-peer OCaml chat server.

Project overview.

What do people want most? A highly efficient chat server made entirely in a functional programming language of course! In that case, look no further, this is a pure OCaml chat server for peer-to-peer messaging. I primarily used Jane Street’s OCaml Async module and a B+ Tree data structure for highly efficient message retrieval. 

 

The interface.

Let’s start with the user interface, Client.ml. This file is run by itself in its own terminal and does not need to have access to any other code through the module system. The client architecture revolves around two main functions: 'repl' and 'do’'. Essentially, 'repl' prompts the user for commands, and 'do’' executes those commands.

Client, implemented as a finite-state machine, iterates 'repl'. Each time 'repl' indexes, the internal state of the execution marches forward until it reaches ‘Repl’. Because the Client file uses the Async library, most of the functionality is monadic in nature. Therefore, any time Client.ml wants to communicate with the server, it must wait to receive a response.

After Client receives a user’s message it pushes it to the pipe, Chat.ml.

 

The pipe.

Chat.ml, the data stream pipe, uses the interface of Client.ml to communicate with the B+ Tree, for message storage and lookup, through the use of asynchronous pipes.

It accomplishes this by creating a TCP Server which then calls a repl-esque handler function to deal with connections to the server. This handler uses pushback to read input and write output to the clients making request through the pipes. The pushback guarantees that the pipe does not overflow, and it also ensures seamless coordination in the communication between the server and its clients. After this is handled, Chat pushes the data to the B+ Tree.

 

The memory.

A B+ tree was used as the primary data structure to store messages within unique conversations. For those that aren’t familiar with the B+ Tree data structure, here is a great explanation. Additionally, given the relative ratio of inserting new data into the database compared to the need to remove data, it was found more useful to implement a Persistent B+ (unable to delete data) tree after researching the different styles in modern databases.

When the B+ Tree receives the user’s message from the pipe, it stores that message in the unique conversation pointer created during conversation inception. It may seem bloated to be unable to delete conversation histories, but the B+ Plus can store n*(n + 1)^(L−1)** pointers at a depth of L, for example our tree can hold a 3,500 unique conversations at a depth of only 4. Thus, the longest path to each node is guaranteed to be log​10(​n). This is a highly efficient data retrieval structure that can store large amounts of data such as frequent chats, with unique keys, between numerous users of the chat system.

**n = order of tree (pointers possible in each node, in this case it is 7)

 


Codebase can be found here.