
How a hedge fund uses MongoDB

Roman Shtylman, Athena Capital Research

Making money in the stock market.


1. Listen to market data
2. ????
3. Profit!

Agenda
About Athena Capital Research
3 uses of MongoDB at Athena
    Dropcopy
    BSON Logging
    Realtime Monitoring
Wrap-Up
Questions

Athena Capital Research


Strong focus on technical talent and technology
90% of employees come from engineering, math, or hard science backgrounds
Quantitative investment manager
    math
Automated trading robots
    C++ speed
Open source stack
    freedom

MongoDB at Athena
Lots of unstructured data
Many sources of data
Want to be able to query quickly
Not everything goes into a database
Avoid creating schema after schema

Dropcopy
Third parties require near-real-time reporting of trading activity
    Accounting
    Risk management
    Compliance
Exchanges provide a "drop-copy" FIX protocol
Scrub the messages and forward to said third party
MongoDB for message passing

FIX Protocol
Financial Information eXchange
Key/value based ASCII
    Header + body + trailer
    Key is numeric (maps to some "standard" name)
    Value is string
Good fit for MongoDB
    Key / value
    Flexible document sizes
    Easier to query than SQL alternatives
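To illustrate why FIX maps so directly onto a document, here is a minimal sketch that splits a message into tag/value pairs and renders it as JSON. It is not the fix2json tool from the talk: the function name `fix_to_dict`, the small tag-name table, and the sample message (whose checksum is not valid) are assumptions made for illustration. Repeating groups, where a tag appears more than once, are not handled.

```python
import json

SOH = "\x01"  # standard FIX field delimiter

# Tiny illustrative subset of the FIX tag dictionary (assumption: a real
# tool would load the full dictionary for the FIX version in use).
TAG_NAMES = {"8": "BeginString", "35": "MsgType", "55": "Symbol", "38": "OrderQty"}

def fix_to_dict(raw, names=TAG_NAMES):
    """Split a raw FIX string into a {name_or_tag: value} dict."""
    doc = {}
    for field in raw.split(SOH):
        if not field:
            continue  # the trailing delimiter leaves an empty chunk
        tag, _, value = field.partition("=")
        # Unknown tags keep their numeric key; values stay strings, as in FIX.
        doc[names.get(tag, tag)] = value
    return doc

# Made-up sample message; the trailer value is not a real checksum.
sample = SOH.join(["8=FIX.4.2", "35=8", "55=IBM", "38=100", "10=123"]) + SOH
doc = fix_to_dict(sample)
print(json.dumps(doc, sort_keys=True))
```

Keeping unknown tags under their numeric key means no message is rejected for lacking a schema entry, which matches the "flexible document sizes" point above.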

Architecture
We have incoming FIX session (drop copy)
Need to have outgoing FIX session
MongoDB acts as the glue (message passing layer)
1. Incoming drop copy -> FIX log file
2. fix2json
3. MongoDB
4. Tail cursor
5. Client

Drop side
C++ client application for the drop copy connection
    Known system and can be kept database free
    QuickFix
fix2json
    Tail reading of output FIX log files
    Easy to represent fix as json and subsequently bson
Keep db inserts independent of FIX connection
    Downsides of combining
        Re-population
        Data will not be resent
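The "tail reading" step can be sketched as a generator that yields lines as they are appended to a log file. This is only an assumption about how such a reader might look, not the actual fix2json code: `follow`, `poll_interval`, and `max_idle_polls` are hypothetical names, and the sample lines use `|` in place of the SOH delimiter, as many human-readable FIX logs do.

```python
import os
import tempfile
import time

def follow(path, poll_interval=0.1, max_idle_polls=None):
    """Yield complete lines appended to `path`, starting from the beginning.

    Stops after `max_idle_polls` consecutive empty polls (None = never stop).
    """
    idle = 0
    buf = ""
    with open(path, "r") as fh:
        while True:
            chunk = fh.readline()
            if chunk:
                idle = 0
                buf += chunk
                if buf.endswith("\n"):   # only emit fully written lines
                    yield buf.rstrip("\n")
                    buf = ""
                continue
            idle += 1
            if max_idle_polls is not None and idle >= max_idle_polls:
                return
            time.sleep(poll_interval)

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "fix.log")
    with open(log_path, "w") as out:
        out.write("8=FIX.4.2|35=8|\n")
    lines = follow(log_path, poll_interval=0.01, max_idle_polls=3)
    first = next(lines)                  # line already in the file
    with open(log_path, "a") as out:     # the FIX application keeps writing
        out.write("8=FIX.4.2|35=0|\n")
    rest = list(lines)                   # picks up the appended line
    print(first, rest)
```

Because the reader only consumes the log file, the FIX connection never waits on the database, and the same file can be read again from the start to re-populate the database.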

MongoDB setup
Capped collection
    Natural index
Data is purged daily using a simple MongoDB shell script
Important to keep tabs on the data size if your data requirements change often
    Mitigated intraday if you are constantly reading
    Critical if you want full replay
Easy to reconcile with Drop FIX logs
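Why the size matters can be shown with a small in-memory model of a capped collection. This is not the MongoDB API: `CappedLog` and its methods are hypothetical, and it caps by document count for simplicity, whereas a real capped collection is sized in bytes. The point is only that a reader who falls behind, or who wants a full replay, loses whatever has already been overwritten.

```python
from collections import deque

class CappedLog:
    """Toy stand-in for a capped collection: fixed capacity, insertion order,
    oldest entries overwritten first."""

    def __init__(self, max_docs):
        self._docs = deque(maxlen=max_docs)
        self._next_seq = 0

    def insert(self, doc):
        self._docs.append((self._next_seq, doc))
        self._next_seq += 1

    def read_from(self, seq):
        """Return (docs with sequence >= seq, count of docs already overwritten)."""
        oldest = self._docs[0][0] if self._docs else self._next_seq
        lost = max(0, oldest - seq)
        return [doc for s, doc in self._docs if s >= seq], lost

log = CappedLog(max_docs=3)
for i in range(5):
    log.insert({"msg": i})

docs, lost = log.read_from(0)        # full replay requested from the start
print(docs, lost)

recent, lost_recent = log.read_from(3)   # a reader that kept up loses nothing
print(recent, lost_recent)
```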

Outgoing side
C++ FIX application
    QuickFix
Tail cursor
    Handling restarts
    Select only required fields
Filter and alter any field before sending
Outgoing message log in FIX
Easily handle different clients
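A rough sketch of the per-client handling: keep only the tags a client needs, override selected fields, and after a restart resume from the last message that client received. The names `prepare_outgoing` and `replay_after` are hypothetical, and the sketch assumes each stored message carries a monotonically increasing sequence number, which the slides do not state.

```python
def prepare_outgoing(doc, keep_tags, overrides=None):
    """Project `doc` onto `keep_tags`, then apply per-client field overrides."""
    out = {tag: doc[tag] for tag in keep_tags if tag in doc}
    out.update(overrides or {})
    return out

def replay_after(messages, last_sent_seq):
    """Yield (seq, doc) pairs the client has not received yet."""
    for seq, doc in messages:
        if seq > last_sent_seq:
            yield seq, doc

# Made-up stored messages keyed by FIX tag number (35=MsgType, 55=Symbol,
# 38=OrderQty, 1=Account, 58=Text).
stored = [
    (1, {"35": "8", "55": "IBM", "38": "100", "1": "INTERNAL-7", "58": "note"}),
    (2, {"35": "8", "55": "MSFT", "38": "200", "1": "INTERNAL-7"}),
    (3, {"35": "8", "55": "AAPL", "38": "50", "1": "INTERNAL-9"}),
]

# The client restarted after receiving message 1, so only 2 and 3 are sent,
# with the internal account replaced by a client-facing one.
sent = [
    (seq, prepare_outgoing(doc, ["35", "55", "38"], {"1": "CLIENT-A"}))
    for seq, doc in replay_after(stored, last_sent_seq=1)
]
print(sent)
```

Different clients can then be served from the same stored data by varying only `keep_tags` and `overrides`.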

Benefits
Full copy of incoming data for querying
    Aggregation queries
Easy replay
    Client disconnects
Easy verification

BSON Logging
Event logging
    Independent of std::cout
    Relevant for tracking down problems and keeping records
Logging time is "wasted" time
Previous logging solution was slow
    XML based
    String conversions
XML is easy to read after logging

BSON Benefits
Binary with loose document format
    Defined by the app during logging
Internal data format for MongoDB
    mongorestore
Exists sequentially in flat files
Easily rendered as json
Numbers:
    original XML implementation: 1k ops/s
    improved XML implementation: 3k ops/s
    first pass BSON implementation: ~20k ops/s
    current BSON implementation: ~30k ops/s
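BSON documents can sit back to back in a flat file because each one starts with its own total length as a little-endian int32. The sketch below is an illustration based on the layout at bsonspec.org, not the logging code from the talk: `encode_doc` is a deliberately tiny encoder that handles only string and int32 values, and `iter_raw_docs` splits a stream back into documents. Both names are hypothetical.

```python
import io
import struct

def encode_doc(fields):
    """Encode a flat dict as one BSON document (str and int32 values only)."""
    body = b""
    for name, value in fields.items():
        key = name.encode("utf-8") + b"\x00"           # element name is a C string
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            body += b"\x02" + key + struct.pack("<i", len(data)) + data
        elif isinstance(value, int):
            body += b"\x10" + key + struct.pack("<i", value)
        else:
            raise TypeError("sketch handles only str and int32 values")
    # Total length counts the 4 length bytes, the elements, and the final 0x00.
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"

def iter_raw_docs(stream):
    """Yield the raw bytes of each document in a stream of concatenated BSON."""
    while True:
        header = stream.read(4)
        if not header:
            return                                      # clean end of file
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack("<i", header)
        if size < 5:
            raise ValueError("invalid document length")
        rest = stream.read(size - 4)
        if len(rest) != size - 4 or rest[-1:] != b"\x00":
            raise ValueError("truncated or malformed document")
        yield header + rest

# Two log events of different shapes written one after the other.
events = [{"event": "order", "qty": 100}, {"event": "heartbeat"}]
flat_file = io.BytesIO(b"".join(encode_doc(e) for e in events))
raw_docs = list(iter_raw_docs(flat_file))
print([len(d) for d in raw_docs])
```

Since only the length prefix is needed to find document boundaries, a post-processing tool can skip through a log without decoding every field.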

BSON Gotchas
BSON date (UTC datetime) type is int64_t milliseconds since the Unix epoch
BSON not a standalone library
    Highly coupled to MongoDB C++ driver
Like MongoDB, schema-less
    Just something to remember if creating post-processing tools
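The millisecond resolution is easy to overlook when events are recorded with finer clocks. As a small illustration (the helper names and the sample date are arbitrary assumptions), converting a microsecond-precision time to whole milliseconds and back drops the sub-millisecond part:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_bson_millis(dt):
    """Whole milliseconds since the Unix epoch (floor division)."""
    return (dt - EPOCH) // timedelta(milliseconds=1)

def from_bson_millis(ms):
    """Rebuild a UTC datetime from a millisecond count."""
    return EPOCH + timedelta(milliseconds=ms)

# Arbitrary sample event with microsecond precision.
event = datetime(2011, 6, 1, 13, 30, 0, 123456, tzinfo=timezone.utc)
ms = to_bson_millis(event)
restored = from_bson_millis(ms)

print(event.microsecond, restored.microsecond)
print(event - restored)   # the precision lost in the round trip
```

If finer resolution matters for event ordering, one option is to store the extra precision in a separate integer field.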

Realtime Monitoring
Log entries are similar to one another
    Some can have extra fields
Each machine contains independent logs
    Each log could be a different format
Daemon to read and insert into MongoDB
Central location, no hunting when problems happen
Real-time monitoring and alerting
    Human intervention required
Web based tools to "tail" view log entries
    WebSockets

Wrap-Up
"Realtime" is relative
    Benchmark to meet your needs
Disjoint pieces can be less prone to failure
Other MongoDB uses
Contribute to LuaMongo driver
BSON code contributions
Bugfixes

Questions?
shtylman@athenacr.com
Reference:
    http://www.mongodb.org/display/DOCS/Tailable+Cursors
FIX:
    http://en.wikipedia.org/wiki/Financial_Information_eXchange
    http://www.quickfixengine.org/
    http://www.onixs.biz/tools/fixdictionary/
BSON:
    http://bsonspec.org/
