San Francisco is a fascinating place: the people, the geography, the food and drink, and the huge number of conferences going on. Last week I met folk at the airport attending everything from HCI through to Maths events. This trip, however, was for the MarkLogic event, and as I'd attended the User Conference in 2011, I knew roughly what to expect.
Meet and greet
OverStory's team has close affinity with MarkLogic. I was a customer for many years, Ron helped build MarkLogic itself, and Craig has worked with them, and been to every MarkLogic conference since they started! It was good for us to meet up, as we're often on different continents, and to reconnect with various members of the MarkLogic organisation.
The bar was the place to congregate on Monday evening, and there we met David Gorbet, who was practising for his keynote the next day. Over beer, he told us what he thought were the killer improvements in MarkLogic version 9, and we all laughed as he mistook me for long-time MarkLogic employee Olav Schwering (a German now living in Australia). Strewth mate!
David's presentations are always entertaining, and the next day was no different, with his take on how MarkLogic can free us from the schema limitations of the past. "Take the Red Pill" illustrated how the semantic data contained in relationships actually has meaning, and therefore value. However, this data isn't present in the RDBMS approach.
Christopher Lindblad was quite open about the battle between open source offerings and commercial products such as MarkLogic. "Go open source and you do it yourself" was his take on this in his Q&A session.
Gary Bloom's keynote, "The World's Best Database for Integrating Data From Silos", focused on silo busting and the fact that the 360-degree data organisations need has to be joined up. Silos are difficult to avoid (company mergers, if nothing else!), and we need an approach that will manage all of the data, whether it's HR, marketing, financial, analytical or content/product based.
Matt Turner (CTO of Media & Publishing) is so full of energy! He's organised bike rides at all the MarkLogic events he's been involved with, and is certainly no slouch, monstering Hawk Hill and the climbs to Muir Woods, which were the two organised rides I joined in on. He can flick out his phone and confirm routes faster than I can say "Tour of California". Then, after a sprint back into town, he's on stage hosting a session on how MarkLogic has been used to leverage metadata on the platforms underpinning NBCUniversal, Warner Bros. and the Production in the Cloud project at USC.
This was a very interesting session too, and illustrated how, not so long ago, films were on, err, film. Now that they're digital, the amount of metadata they create is massive, with everything from product placements, actor information and camera data through to analytics.
MarkLogic Version 9: A broad offering
Continuing from what we had learned from David, Joe Pasqua gave Wednesday's keynote and summarised the features in MarkLogic version 9. This version's improvements are very broad, and offer an increasingly mature set of features. Notably, the security, data privacy and operations improvements will attract many. We were particularly focussed on the Optic API, which includes several new index types and provides aggregates over documents. This should make it easier to fetch data from datasets without having to continually reprocess it as new requirements arise.
Other notables included in MarkLogic V9:
- Entity Services can provide REST interfaces to the entities
- Enhanced SQL and ODBC, allowing relational queries over documents using standard SQL
- Data Movement SDK: get data in/out the way you want, especially relational
- Camel and Spark reference implementation
- Security: Encryption moving into the core of MarkLogic
- Separate key management control
- Role-based security down to document level
- Ops Director, a GUI tool for managing the system
- Telemetry comms channel to MarkLogic support
- Rolling upgrades providing zero application downtime
- Non-disruptive operations concept
- Bitemporal improvements
- Tiered storage improvements
- More detail in the Geospatial data
- Cloud improvements: Azure, Google, AWS
As with other multi-track events of this type, seeing all the content is difficult, and I'll be attending some of the sessions at the London event to fill in the gaps.
Let me or any of the OverStory team know if you're going too. Boris bikes this time.