Matthew Hodgson

158 posts tagged with "Matthew Hodgson" (See all Author)

Matrix.org homeserver outage (25th Jan 2017)

25.01.2017 00:00 — Tech Matthew Hodgson

Hi folks,

As many will have noticed there was a major outage on the Matrix homeserver for matrix.org last night (UK-time). This impacted anyone with an account on the matrix.org server, as well as anyone using matrix.org-hosted bots & bridges. As Matrix rooms are shared over all participants, rooms with participants on other servers were unaffected (for users on those servers). Here's a quick explanation of what went wrong (times are UTC):

  • 2017-01-24 16:00 - We notice that we're badly running out of diskspace on the matrix.org backup postgres replica. (Turns out the backup box, whilst identical hardware to the master, had been built out as RAID-10 rather than RAID-5 and so has less disk space).
  • 2017-01-24 17:00 - We decide to drop a large DB index: event_push_actions(room_id, event_id, user_id, profile_tag), which was taking up a disproportionate amount of disk space, on the basis that it didn't appear to be being used according to the postgres stats. All seems good.
  • 2017-01-24 ~23:00 - The core matrix.org team go to bed.
  • 2017-01-24 23:33 - Someone redacts an event in a very active room (probably #matrix:matrix.org) which necessitates redacting the associated push notification from the event_push_actions table. This takes out a lock within persist_event, which is then blocked on deleting the push notification. It turns out that this deletion requires the missing DB constraint, causing the query to run for hours whilst holding the transaction lock. The symptoms are that anything reading events from the DB was blocked on the transaction, causing messages not to be relayed to other clients or servers despite appearing to send correctly. Meanwhile, the fact that events are being received by the server fine (including over federation) makes the monitoring graphs look largely healthy.
  • 2017-01-24 23:35 - End-to-end monitoring detects problems, and sends alerts into pagerduty and various Matrix rooms. Unfortunately we'd failed to upgrade the pageduty trial into a paid account a few months ago, however, so the alerts are lost.
  • 2017-01-25 08:00 - Matrix team starts to wake up and spot problems, but confusion over the right escalation process (especially with Matthew on holiday) means folks assume that other members of the team must already be investigating.
  • 2017-01-25 09:00 - Server gets restarted, service starts to resume, although box suffers from load problems as traffic tries to catch up.
  • 2017-01-25 09:45 - Normal service on the homeserver itself is largely resumed (other than bridges; see below)
  • 2017-01-25 10:41 - Root cause established and the redaction path is patched on matrix.org to stop a recurrence.
  • 2017-01-25 11:15 - Bridges are seen to be lagging and taking much longer to recover than expected. Decision made to let them continue to catch up normally rather than risk further disruption (e.g. IRC join/part spam) by restarting them.
  • 2017-01-25 13:00 - All hosted bridges returned to normal.

Obviously this is rather embarrassing, and a huge pain for everyone using the matrix.org homeserver - many apologies indeed for the outage. On the plus side, all the other Matrix homeservers out there carried on without noticing any problems (which actually complicated spotting that things had broken, given many of the core team primarily use their personal homeservers).

In some ways the root cause here is that the core team has been focusing all its energy recently on improving the overall Matrix codebase rather than operational issues on matrix.org itself, and as a result our ops practices have fallen behind (especially as the health of the Matrix ecosystem as a whole is arguably more important than the health of a single homeserver deployment). However, we clearly need to improve things here given the number of people (>750K at the last count) dependent on the Matrix.org homeserver and its bridges & bots.

Lessons learnt on our side are:

  • Make sure that even though we had monitoring graphs & thresholds set up on all the right things... monitoring alerts actually have to be routed somewhere useful - i.e. phone calls to the team's phones. Pagerduty is now set up and running properly to this end.
  • Make sure that people know to wake up the right people anyway if the monitoring alerting system fails.
  • To be even more paranoid about hotfixes to production at 5pm, especially if they can wait 'til the next day (as this one could have).
  • To investigate ways to rapidly recover bridges without causing unnecessary disruption.

Apologies again to everyone who was bitten by this - we're doing everything we can to ensure it doesn't happen again.

Matthew & the team.

Synapse 0.18.7 is out - Please upgrade, especially if on 0.18.5 or 0.18.6.

06.01.2017 00:00 — Tech Matthew Hodgson

Hi all,

TL;DR: Please upgrade to Synapse 0.18.6, especially if you are on 0.18.5 which is a bad release.

TL;DR: Please upgrade to Synapse 0.18.7 - especially if you are on 0.18.5 or 0.18.6 which both have serious federation bugs.

Synapse 0.18.5 contained a really nasty regression in the federation code which causes servers to echo transactions that they receive back out to the other servers participating in a room. This has effectively resulted in a gradual amplification of federation traffic as more people have installed 0.18.5, causing every transaction to be received N times over where N is the number of servers in the room.

We'll do a full write-up once we're happy we've tracked down all the root problems here, but the short story is that this hit critical mass around Dec 26, where typical Synapses started to fail to keep up with the traffic - especially when requests hit some of the more inefficient or buggy codepaths in Synapse.  As servers started to overload with inbound connections, this in turn started to slow down and consume resources on the connecting servers - especially due to an architectural mistake in Synapse which blocks inbound connections until the request has been fully processed (which could require the receiving server in turn to make outbound connections), rather than releasing the inbound connection asap.  This hit the point that servers were running out of file descriptors due to all the outbound and inbound connections, at which point they started to entirely tarpit inbound connections, resulting in a slow feedback loop making the whole situation even worse.

We've spent the last two weeks hunting all the individual inefficient requests which were mysteriously starting to cause more problems than they ever had before; then trying to understand the feedback misbehaviour; before finally discovering the regression in 0.18.5 as the plausible root cause of the problem.  Troubleshooting has been complicated by most of the team having unplugged for the holidays, and because this is the first (and hopefully last!) failure mode to be distributed across the whole network, making debugging something of a nightmare - especially when the overloading was triggering a plethora of different exotic failure modes.  Huge thanks to everyone who has shared their server logs with the team to help debug this.

Some of these failure modes are still happening (and we're working on fixing them), but we believe that if everyone upgrades away from the bad 0.18.5 release most of the symptoms will go away, or at least go back to being as bad as they were before.  Meanwhile, if you find your server suddenly grinding to a halt after upgrading to 0.18.6 0.18.7 please come tell us in #matrix-dev:matrix.org.

We're enormously sorry if you've been bitten by the federation instability this has caused - and many many thanks for your patience whilst we've hunted it down.  On the plus side, it's given us a lot of very useful insight into how to implement federation in future homeservers to not suffer from any of these failure modes.  It's also revealed the root cause of why Synapse's RAM usage is quite so bad - it turns out that it actually idles at around 200MB with default caching, but there's a particular codepath which causes it to spike temporarily by 1GB or so - and that RAM is then not released back to the OS.  We're working on a fix for this too, but it'll come after 0.18.7.

Unfortunately the original release of 0.18.6 still exhibits the root bug, but 0.18.7 (originally released as 0.18.7-rc2) should finally fix this.  Sorry for all the upgrades :(

So please upgrade as soon as possible to 0.18.7. Debian packages are available as normal.

thanks,

Matthew

Changes in synapse v0.18.7 (2017-01-09)

  • No changes from v0.18.7-rc2

Changes in synapse v0.18.7-rc2 (2017-01-07)

Bug fixes:

  • Fix error in rc1's discarding invalid inbound traffic logic that was incorrectly discarding missing events

Changes in synapse v0.18.7-rc1 (2017-01-06)

Bug fixes:

  • Fix error in #PR 1764 to actually fix the nightmare #1753 bug.
  • Improve deadlock logging further
  • Discard inbound federation traffic from invalid domains, to immunise against #1753

Changes in synapse v0.18.6 (2017-01-06)

Bug fixes:

  • Fix bug when checking if a guest user is allowed to join a room - thanks to Patrik Oldsberg (PR #1772)

Changes in synapse v0.18.6-rc3 (2017-01-05)

Bug fixes:

  • Fix bug where we failed to send ban events to the banned server (PR #1758)
  • Fix bug where we sent event that didn't originate on this server to other servers (PR #1764)
  • Fix bug where processing an event from a remote server took a long time because we were making long HTTP requests (PR #1765, PR #1744)
Changes:
  • Improve logging for debugging deadlocks (PR #1766, PR #1767)

Changes in synapse v0.18.6-rc2 (2016-12-30)

Bug fixes:

  • Fix memory leak in twisted by initialising logging correctly (PR #1731)
  • Fix bug where fetching missing events took an unacceptable amount of time in large rooms (PR #1734)

Changes in synapse v0.18.6-rc1 (2016-12-29)

Bug fixes:

  • Make sure that outbound connections are closed (PR #1725)

The Matrix Holiday Special! (2016 edition)

26.12.2016 00:00 — General Matthew Hodgson

We seem to have fallen into the pattern of giving seasonal 'state of the union' updates on the Matrix blog, despite best intentions to blog more frequently... although given the Autumn Update ended up being posted in November this one is going to be a relatively incremental update.  Let's jump straight in:

E2E Encryption

Unless you've been in a coma for the last month you'll have hopefully noticed that we launched the formal beta for E2E Encryption across matrix-{'{'}js,ios,android{'}'}-sdk (and thus Riot/{'{'}Web, iOS, Android{'}'}) in November, complete with the successful independent public security assessment of our Olm and Megolm cryptography library from NCC Group.  So far the beta has gone well in parts: the core Olm/Megolm crypto library has held up well with no bugfixes at all required since the audit (yay!).  However, we've hit a lot of different edge cases in the wild where devices can fail to share their outbound session ratchet state to other devices present in the room.  This results in the infamous "Unknown Inbound Session ID" (UISI) errors which many folks will have seen (now renamed to the more meaningful "Unable to decrypt: The sender's device has not sent us the keys for this message" error).

Unfortunately there's a bunch of entirely different causes for this, both platform-specific and cross-platform, and we've been running around untangling all the error reports and getting to the bottom of it.  The good news is that we think we now know the vast majority of the causes, and fixes are starting to land.  We've also just finished a fairly time-consuming formal crypto code-review on the three application SDK implementations (JS, iOS & Android) to shake out any other issues.  Meanwhile some new features have also landed - e.g. the ability for guests to use E2E!  The remaining stuff at this point before we can consider declaring E2E out of beta is:

  • Finish fixing the UISI errors (in progress)
  • Warn when unverified devices are added to a room
  • Implement passphrased backup & restore for E2E state, so that folks can avoid losing their E2E history when they logout or switch to a new device
  • Improve device verification.
Thanks to everyone who's been using E2E and reporting issues - given the number of different UISI error causes out there, it's been really useful to go through the different bug reports that folks have submitted.  Please continue to submit them when you see unexpected problems (especially over the coming months as stability improves!)

New Projects!

There have been a tonne of new projects popping up from all over the place since the last update.  Looking at the git history of the projects page, we've been adding one every few days!  Highlights include:

Bridges:

Clients

Other projects

Bots and Bridges

There's been a bunch of work from the core team on bots & bridges infrastructure over the last month:

Rearchitecting the slack and gitter bridges to optionally support 'puppeting users'.  This is in some ways the ultimate flavour of bridging - where you authenticate with the remote service using your "real" gitter/slack/... credentials, and then the bridge has access to synchronise your full spectrum of data with Matrix.  This is in contrast to the current implementations where the bridge creates virtual users (e.g. Slack webhook bots or IRC virtual user bots) or uses a predefined bot (e.g. matrixbridge on gitter) to link the rooms.

This has some huge advantages: e.g. ability to bridge Slack and Gitter DMs through properly to Matrix; bridging presence and typing notifications correctly, not requiring any custom bots or integrations to be configured; not proxying via a crappy bridge bot as per gitter today; letting Matrixed users be completely indistinguishable from their native selves on the remote platform - so supporting tab complete in Slack, profiles, presence, etc.  The main disadvantage is that you have to have an account on the platform already (although you could argue this is a feature, especially from the remote network's perspective!) and that you are delegating access to that account through to the bridge, so you'd better trust it.  However, you can always run your own bridge if trust is an issue.

The work on this is mid-progress currently, but we're really looking forward to seeing the official Slack, Gitter and other bridges support this mode of operation in the new year!

We've also been spending some time working with bridges written by the wider community (e.g. Half-Shot's twitter bridge) to get them deployed on the matrix.org homeserver itself, to help folks who can't run their own.

Meanwhile there's been a lot of work going into supporting the IRC bridge. Main highlights there are:

  • The release of matrix-appservice-irc 0.7, with all sorts of major new features
  • Turning on bi-directional membership list syncing at last for all networks other than Freenode!  In theory, at least, you should finally see the same list of room members in both IRC and Matrix!!
  • Handling IRC PM botspam from Freenode and OFTC, which bridge through as invite spam into Matrix.  Sorry if you've been bitten by this.  We've worked around it for now by setting appropriate umodes on the IRC bots, and by implementing a 'bulk reject' button on Riot (under in Settings).  This caused a few nasty outages on Freenode and OFTC. On the plus side, at least it shows that Riot scales up to receiving 2000+ invites without exhibiting ill effects...
  • Considering how to improve history visibility on IRC to avoid scenarios where channel history is shared between users in the same room (even if their IRC bot has temporarily disconnected).  This was a major problem during the Freenode/OFTC outages mentioned earlier.
Last but not least, we've just released gomatrix - a new official Matrix client SDK for golang!  Go-neb (the reference golang Matrix bot framework) has been entirely refactored to use gomatrix, which should keep it honest as a 1st class Matrix client SDK for those in the Golang community.  We highly recommend all Golang nuts to go read the documentation and give it a spin!

Riot Desktop

Riot development has been largely preoccupied with E2E debugging in the respective Matrix Client SDKs, but 0.9.3 was released last week adding in Electron-based desktop app support.  (Remember, if you hate Electron-style desktop apps which provide a desktop app by embedded a webbrowser, you can always use another Matrix client!).  If you've been missing having Riot as a proper desktop app, go get involved!

screen-shot-2016-12-26-at-01-00-12

Next Generation Homeservers

Ruma

Ruma is a project led by Jimmy Cuadra to build a Matrix homeserver in Rust - the project has been ploughing steadily onwards through 2016 with a bit of an acceleration during December.  You can follow progress at the excellent This Week in Ruma blog, watching the project on Github, and tracking the API status dashboard.  Some of the latest PRs are looking very promising in terms of getting the core remaining CS APIs working, e.g:

Needless to say, we've been keeping an eye on Ruma with extreme interest, not least as some of the Matrix core team are rabid Rustaceans too :)  We can't wait to see it exposing a usable CS API in the hopefully not-too-distant future!!

Dendrite

Meanwhile, in the core team, we've been doing some fairly serious experimentation on next-generation homeservers.  Synapse is in a relatively stable state currently, and we've implemented most of the horizontal scalability tricks available to us there (e.g. splitting out worker processes).  Instead we're starting to hit some fundamental limitations of the architecture: the fact that the whole codebase effectively assumes that it's talking to a single consistent database instance; python's single-threadedness and memory inefficiency; twisted's lack of profiling; being limited to sqlite's featureset; the fact that the schema has grown organically and is difficulty to refactor aggressively; the fact the app papers over SQL problems by caching everything in RAM (resulting in synapse's high RAM requirements); the constant bugs caused by lack of type safety; etc.

We started an experiment in Golang to fix some of this a year ago in the form of Dendron - a "strangler pattern" homeserver skeleton intended to sit in front of a synapse and slowly port endpoints over to Go.  In practice, Dendron ended up just being a rather dubious Matrix-aware loadbalancer, and meanwhile no endpoints got moved into it (other than /login, which then got moved out again due to the extreme confusion of having to maintain implementations in both Dendron & Synapse).  The main reasons for Dendron's failure are a) we had enough on our hands supporting Synapse; b) there were easier scalability improvements (e.g. workers) to be had on Synapse; c) the gradual migration approach looked like it would end up sharing the same storage backend as Synapse anyway, and potentially end up inheriting a bunch of Synapse's woes.

So instead, a month or so ago we started a new project codenamed Dendrite (aka Dendron done right ;D) - this time an entirely fresh standalone Golang codebase for rapid development and iteration on the platonic ideal of a next-generation homeserver (and an excuse to audit and better document & spec some of the murkier bits of Matrix).  The project is still very early and there's no doc or code to be seen yet, but it's looking cautiously optimistic (especially relative to Dendron!).  The project goals are broadly:

  1. To build a new HS capable of supporting the exponentially increasing load on matrix.org ASAP (which is currently at 600K accounts, 50K rooms, 5 messages/s and growing fast).
  2. To architecturally support full horizontal scalability through clustering and sharding from the outset - i.e. no single DBs or DB writer processes.
  3. To optimise for Postgres rather than be constrained by SQLite, whilst still aiming for a simple but optimal schema and storage layer.  Optimising for smaller resource footprints (e.g. environments where a Postgres is overkill) will happen later - but the good news is that the architecture will support it (unlike Synapse, which doesn't scale down nicely even with SQLite).
It's too early to share more at this stage, but thought we should give some visibility on where things are headed!  Needless to say, Synapse is here for the foreseeable - we think of it as being the Matrix equivalent of the role Apache httpd played for the Web.  It's not enormously efficient, but it's popular and relatively mature, and isn't going away.  Meanwhile, new generations of servers like Ruma and Dendrite will come along for those seeking a sleeker but more experimental beast, much as nginx and lighttpd etc have come along as alternatives to Apache.  Time will tell how the server ecosystem will evolve in the longer term, but it's obviously critical to the success of Matrix to have multiple active independent server implementations, and we look forward to seeing how Synapse, Ruma & Dendrite progress!

2017

Looking back at where we were at this time last year, 2016 has been a critical year for Matrix as the ecosystem has matured - rolling out E2E encryption; building out proper bot & bridge infrastructure; stabilising and tuning Synapse to keep up with the exponential traffic growth; seeing the explosion of contributors and new projects; seeing Riot edging closer to becoming a viable mainstream communication app.

2017 is going to be all about scaling Matrix - both the network, the ecosystem, and the project.  Whilst we've hopefully transitioned from being a niche decentralisation initiative to a relatively mainstream FOSS project, our ambition is unashamedly to become a mainstream communication (meta)network usable for the widest possible audience (whilst obviously still supporting our current community of FOSS & privacy advocates!).  With this in mind, stuff on the menu for 2017 includes:

  • Getting E2E Encryption out of beta asap.
  • Ensuring we can scale beyond Synapse - see Dendrite, above.
  • Getting as many bots and bridges into Matrix as possible, and doing everything we can to support them, host them and help them be as high quality as possible - making the public federated Matrix network as useful and diverse as possible.
  • Supporting Riot's leap to the mainstream, ensuring Matrix has at least one killer app.
  • Adding the final major missing features:
    • Customisable User Profiles (this is almost done, actually)
    • Groups (i.e. ability to define groups of users, and perform invites, powerlevels etc per-group as well as per-user)
    • Threading
    • Editable events (and Reactions)
  • Maturing and polishing the spec (we are way overdue a new release)
  • Improving VoIP - especially conferencing.
  • Reputation/Moderation management (i.e. spam/abuse filtering).
  • Much-needed SDK performance work on matrix-{'{'}react,ios,android{'}'}-sdk.
  • ...and a few other things which would be premature to mention right now :D
This is going to be an incredibly exciting ride (right now, it feels a bit like being on a toboggan which has made its way onto a fairly steep ski slope...) and we can only thank you: the community, for getting the project to this point - whether you're hacking on Matrix, contributing pull requests, filing issues, testing apps, spreading the word, or just simply using it.

See you in 2017, and thanks again for flying Matrix.

  • Matthew, Amandine & the Matrix Team.

matrix-appservice-irc 0.7.0 is out!

19.12.2016 00:00 — Tech Matthew Hodgson

Also, we've just released a major update to the IRC bridge codebase after trialling it on the matrix.org-hosted bridges for the last few days.

The big news is:

  • The bridge uses Synapse 0.18.5's new APIs for managing the public room list (improving performance a bunch)
  • Much faster startup using the new /joined_rooms and /joined_members APIs in Synapse 0.18.5
  • The bridge will now remember your NickServ password (encrypted at rest) if you want it to via the !storepass command
  • You can now set arbitrary user modes for IRC clients on connection (to mitigate PM spam etc)
  • After a split, the bridge will drop Matrix->IRC messages older than N seconds, rather than try to catch the IRC room up on everything they missed on Matrix :S
  • Operational metrics are now implemented using Prometheus rather than statsd
  • New !quit command to nuke your user from the remote IRC network
  • Membership list syncing for IRC->Matrix is enormously improved, and enabled for all matrix.org-hosted bridges apart from Freenode.  <b>At last, membership lists should be in sync between IRC and Matrix; please let us know if they're not</b>.
  • Better error logging
For full details, please see the changelog.

With things like NickServ-pass storing, !quit support and full bi-directional membership list syncing, it's never been a better time to run your own IRC bridge.  Please install or upgrade today from https://github.com/matrix-org/matrix-appservice-irc!

Synapse 0.18.5 released!

19.12.2016 00:00 — Tech Matthew Hodgson

Hi folks,

We released synapse 0.18.5 on Friday.  This is mainly about fixing performance problems with the unread room counts and the public room directory; polishing the E2E endpoints based on beta feedback; and general minor bits and bobs.

Get it whilst it's (almost) hot from https://github.com/matrix-org/synapse!  Changelog follows:

Changes in synapse v0.18.5 (2016-12-16)

Bug fixes:

  • Fix federation /backfill returning events it shouldn't (PR #1700)
  • Fix crash in url preview (PR #1701)

Changes in synapse v0.18.5-rc3 (2016-12-13)

Features:

  • Add support for E2E for guests (PR #1653)
  • Add new API appservice specific public room list (PR #1676)
  • Add new room membership APIs (PR #1680)
Changes:
  • Enable guest access for private rooms by default (PR #653)
  • Limit the number of events that can be created on a given room concurrently (PR #1620)
  • Log the args that we have on UI auth completion (PR #1649)
  • Stop generating refresh_tokens (PR #1654)
  • Stop putting a time caveat on access tokens (PR #1656)
  • Remove unspecced GET endpoints for e2e keys (PR #1694)
Bug fixes:
  • Fix handling of 500 and 429's over federation (PR #1650)
  • Fix Content-Type header parsing (PR #1660)
  • Fix error when previewing sites that include unicode, thanks to @kyrias (PR #1664)
  • Fix some cases where we drop read receipts (PR #1678)
  • Fix bug where calls to /sync didn't correctly timeout (PR #1683)
  • Fix bug where E2E key query would fail if a single remote host failed (PR #1686)

Changes in synapse v0.18.5-rc2 (2016-11-24)

Bug fixes:

  • Don't send old events over federation, fixes bug in -rc1.

Changes in synapse v0.18.5-rc1 (2016-11-24)

Features:

  • Implement "event_fields" in filters (PR #1638)
Changes:
  • Use external ldap auth package (PR #1628)
  • Split out federation transaction sending to a worker (PR #1635)
  • Fail with a coherent error message if /sync?filter= is invalid (PR #1636)
  • More efficient notif count queries (PR #1644)

When Ericsson discovered Matrix...

23.11.2016 00:00 — General Matthew Hodgson

As something completely different, we've invited Stefan Ålund and his team at Ericsson to write a guest blog post about the really cool stuff Ericsson is doing with Matrix.  This is a fascinating glimpse into how major folks are already launching commercial products on top of Matrix - whilst also making significant contributions back to the projects and the community.  We'd like to thank Stefan and Ericsson for all their support and perseverance, and we wish them the very best with the Ericsson Contextual Communication Cloud!

-- Matthew

At the end of 2014, my colleague Adam Bergkvist and I attended the WebRTC Expo in Paris, partly to promote our Open Source project OpenWebRTC, but also to meet the rest of the European WebRTC community and see what others were working on.

At Ericsson Research we had been working on WebRTC for quite some time. Not only on the client-side framework and how those could enable some truly experimental stuff, but more importantly how this emerging technology could be used to build new kinds of communication services where communication is not the service (A calling B), but is integrated as part of some other service or context. A simple example would be a health care solution, where the starting point could be the patient records and communication technologies are integrated to enable remote discussions between patients and their doctors.

Our research in this area, that we started calling “contextual communication”, pointed in a different direction from Ericsson's traditional communication business, therefore making it hard for us to transfer our ideas and technologies out from Ericsson Research. We increasingly had the feeling that we needed to build something new and start from a clean slate, so to speak.

Some of our guiding principles:

  • Flexibility - the communication should be able to integrate anywhere
  • Fast iterations - browsers and WebRTC are moving targets
  • Open - interoperability is important for communication systems
  • Low cost - operations for the core communication should approach 0
  • Trust - build on the Ericsson brand and technology leadership
We had a pretty good idea about what we wanted to build, but even though Ericsson is a big company, the team working in this area was relatively small and also had a number of other commitments that we couldn't abandon.

I think that is enough of a background, let's circle back to the WebRTC Expo and the reason why I am writing this post on the Matrix blog.

Adam and I were pretty busy in our booth talking to people and giving demos, so we actually missed when Matrix won the Best Innovation Award. Nonetheless we finally got some time to walk around and I started chatting with Matthew and Amandine who were manning the Matrix booth. Needless to say, I was really impressed with their vision and what they wanted to build. The comparison to email and how they wanted to make it possible to build an interoperable bridge between communication “islands”, all in an open (source) manner, really appealed to me.

To be honest, the altruistic aspects of decentralising communication was not the most important part for us, even if we were sympathetic to the cause, working for a company that was founded from "the belief that communication is a basic human need". We ultimately wanted to build a new kind of communication offering from Ericsson, and it looked like Matrix might be able to play a part in that.

I had recently hired a couple of interns and as soon as I came back from Paris, we set them to work evaluating Matrix. They were quickly able to port an existing WebRTC service (developed and used in-house) to use Matrix signalling and user management. We initially had some concerns about the maturity of the reference Home Server implementation (remember, this was almost 2 years ago) and we didn't want to start developing our own since we were still a small team. However, Matthew and the rest of the Matrix team worked closely with us, helping to answer all our (dumb) questions and we finally got to a point where we had the confidence to say “screw it, let's try this and see if it flies”. ?

Ericsson had recently launched the Ericsson Garage where employees could pitch ideas for how to build new business. So we decided to give the process a try and presented an idea on how Ericsson could start selling contextual communication as-a-Service, directly to enterprises that wanted help integrating communication into their business processes, but didn't necessarily have the competence or business interest to run their own communication services. We got accepted and moved (physically) out of Research to sit in the Garage for the next 4 months, developing a MVP.

Since the primary interface to our offering would be through SDKs on various platforms, we decided early on to develop our own. The SDKs were implementing the standard Matrix specification, but we put a lot of time in increasing the robustness and flexibility in the WebRTC call handling and eventually with added peer-2-peer data and collaboration features, on top of the secure WebRTC DataChannel. On the server side, our initialconcerns about Synapse were eventually removed completely as the Matrix team relentlessly kept working on fixing performance issues, patching security holes and provided a story on how to scale. Over the years we have contributed with several patches to Synapse (SAML auth and auth improvements; application service improvements) and provided input to the Matrix specification. We have always found the Matrix team be very inclusive and easy to work with.

The project graduated successfully from the Ericsson Garage and moved in to Ericsson's Business Unit IT & Cloud Products, where we started to increase the size of the team and just last month signed a contract with our first customer. We call the solution Ericsson Contextual Communication Cloud, or ECCC for short, and it can be summarised on a high level by the following picture:

ECCC in a nutshell

If you are interested in ECCC, feel free to reach out at https://discuss.c3.ericsson.net

As with any project developed in the open, it is essential to have a healthy community around it. We have received excellent support from the Matrix project and they have always been open for discussion, engaged our developers and listened to our needs. We depend on Matrix now and we see great potential for the future. We hope that others will adopt the technology and help make the community grow even stronger.

  • Stefan Ålund and the Ericsson ECCC Team

Synapse 0.18.4

22.11.2016 00:00 — Tech Matthew Hodgson

Uncharacteristically, we're actually remembering to announce a new release of Synapse!

Major performance fixes on federation, as well as the changes required to support E2E encrypted attachments (yay!)

Please install or upgrade from https://github.com/matrix-org/synapse :)

Changes in synapse v0.18.4 (2016-11-22)

Bug fixes:

  • Add workaround for buggy clients that the fail to register (PR #1632)

Changes in synapse v0.18.4-rc1 (2016-11-14)

Changes:

  • Various database efficiency improvements (PR #1188, #1192)
  • Update default config to blacklist more internal IPs, thanks to Euan Kemp @euank (PR #1198)
  • Allow specifying duration in minutes in config, thanks to Daniel Dent @DanielDent (PR #1625)
Bug fixes:
  • Fix media repo to set CORs headers on responses (PR #1190)
  • Fix registration to not error on non-ascii passwords (PR #1191)
  • Fix create event code to limit the number of prev_events (PR #1615)
  • Fix bug in transaction ID deduplication (PR #1624)

Matrix’s ‘Olm’ End-to-end Encryption security assessment released - and implemented cross-platform on Riot at last!

21.11.2016 00:00 — General Matthew Hodgson

TL;DR: We're officially starting the cross-platform beta of end-to-end encryption in Matrix today, with matrix-js-sdk, matrix-android-sdk and matrix-ios-sdk all supporting e2e via the Olm and Megolm cryptographic ratchets.  Meanwhile, NCC Group have publicly released their security assessment of the underlying libolm library, kindly funded by the Open Technology Fund, giving a full and independent transparent report on where the core implementation is at. The assessment was incredibly useful, finding some interesting issues, which have all been solved either in libolm itself or at the Matrix client SDK level.

If you want to get experimenting with E2E, the flagship Matrix client Riot has been updated to use the new SDK on Web, Android and iOS… although the iOS App is currently stuck in “export compliance” review at Apple. However, iOS users can mail support@riot.im to request being added to the TestFlight beta to help us test!  Update: iOS is now live and approved by Apple (as of Thursday Nov 24.  You can still mail us if you want to get beta builds though!)

We are ridiculously excited about adding an open decentralised e2e-encrypted pubsub data fabric to the internet, and we hope you are too! :D


Ever since the beginning of the Matrix we've been promising end-to-end (E2E) encryption, which is rather vital given conversations in Matrix are replicated over every server participating in a room.  This is no different to SMTP and IMAP, where emails are typically stored unencrypted in the IMAP spools of all the participating mail servers, but we can and should do much better with Matrix: there is no reason to have to trust all the participating servers not to snoop on your conversations.  Meanwhile, the internet is screaming out for an open decentralised e2e-encrypted pubsub data store - which we're now finally able to provide :)

Today marks the start of a formal public beta for our Megolm and Olm-based end-to-end encryption across Web, Android and iOS. New builds of the Riot matrix client have just been released on top of the newly Megolm -capable matrix-js-sdk, matrix-ios-sdk and matrix-android-sdk libraries .  The stuff that ships today is:

  • E2E encryption, based on the Olm Double Ratchet and Megolm ratchet, working in beta on all three platforms.  We're still chasing a few nasty bugs which can cause ‘unknown inbound session IDs', but in general it should be stable: please report these via Github if you see them.
  • Encrypted attachments are here! (limited to ~2MB on web, but as soon as https://github.com/matrix-org/matrix-react-sdk/pull/562 lands this limit will go away)
  • Encrypted VoIP signalling (and indeed any arbitrary Matrix events) are here!
  • Tracking whether the messages you receive are from ‘verified' devices or not.
  • Letting you block specific target devices from being able to decrypt your messages or not.
  • The Official Implementor's Guide.  If you're a developer wanting to add Olm into your Matrix client/bot/bridge etc, this is the place to start.
Stuff which remains includes:
  • Speeding up sending the first message after adds/removes a device from a room (this can be very slow currently - e.g. 10s, but we can absolutely do better).
  • Proper device verification.  Currently we compare out-of-band device fingerprints, which is a terrible UX.  Lots of work to be done here.
  • Turning on encryption for private rooms by default.  We're deliberately keeping E2E opt-in for now during beta given there is a small risk of undecryptable messages, and we don't want to lull folks into a false sense of security.  As soon as we're out of beta, we'll obviously be turning on E2E for any room with private history by default.  This also gives the rest of the Matrix ecosystem a chance to catch up, as we obviously don't want to lock out all the clients which aren't built on matrix-{'{'}js,ios,android{'}'}-sdk.
  • We're also considering building a simple Matrix proxy to aid migration that you can run on localhost that E2Es your traffic as required (so desktop clients like WeeChat, NaChat, Quaternion etc would just connect to the proxy on localhost via pre-E2E Matrix, which would then manage all your keys & sessions & ratchets and talk E2E through to your actual homeserver.
  • Matrix clients which can't speak E2E won't show encrypted messages at all.
  • ...lots and lots of bugs :D .  We'll be out of beta once these are all closed up.
In practice the system is working very usably, especially for 1:1 chats.  Big group chats with lots of joining/parting devices are a bit more prone to weirdness, as are edge cases like running multiple Riot/Webs in adjacent tabs on the same account.  Obviously we don't recommend using the E2E for anything mission critical requiring 100% guaranteed privacy whilst we're still in beta, but we do thoroughly recommend everyone to give it a try and file bugs!

In Riot you can turn it on a per-room basis if you're an administrator that room by flipping the little padlock button in Room Settings.  Warning: once enabled, you cannot turn it off again for that room (to avoid the race condition of people suddenly decrypting a room before someone says something sensitive):

screen-shot-2016-11-21-at-15-21-15

The journey to end-to-end encryption has been a bit convoluted, with work beginning in Feb 2015 by the Matrix team on Olm: an independent Apache-licensed implementation in C/C++11 of the Double Ratchet algorithm designed by Trevor Perrin and Moxie Marlinspike ( https://github.com/trevp/double_ratchet/wiki - then called ‘axolotl').  We picked the Double Ratchet in its capacity as the most ubiquitous, respected and widely studied e2e algorithm out there - mainly thanks to Open Whisper Systems implementing it in Signal, and subsequently licensing it to Facebook for WhatsApp and Messenger, Google for Allo, etc.  And we reasoned that if we are ever to link huge networks like WhatsApp into Matrix whilst preserving end-to-end encrypted semantics, we'd better be using at least roughly the same technology :D

One of the first things we did was to write a terse but formal spec for the Olm implementation of the Double Ratchet, fleshing out the original sketch from Trevor & Moxie, especially as at the time there wasn't a formal spec from Open Whisper Systems (until yesterday! Congratulations to Trevor & co for publishing their super-comprehensive spec :).  We wrote a first cut of the ratchet over the course of a few weeks, which looked pretty promising but then the team got pulled into improving Synapse performance and features as our traffic started to accelerate faster than we could have possibly hoped.  We then got back to it again in June-Aug 2015 and basically finished it off and added a basic implementation to matrix-react-sdk (and picked up by Vector, now Riot)… before getting side-tracked again.  After all, there wasn't any point in adding e2e to clients if the rest of the stack is on fire!

Work resumed again in May 2016 and has continued ever since - starting with the addition of a new ratchet to the mix.  The Double Ratchet (Olm) is great at encrypting conversations between pairs of devices, but it starts to get a bit unwieldy when you use it for a group conversation - especially the huge ones we have in Matrix.  Either each sender needs to encrypt each message N times for every device in the room (which doesn't scale), or you need some other mechanism.

For Matrix we also require the ability to explicitly decide how much conversation history may be shared with new devices.  In classic Double Ratchet implementations this is anathema: the very act of synchronising history to a new device is a huge potential privacy breach - as it's deliberately breaking perfect forward secrecy.  Who's to say that the device you're syncing your history onto is not an attacker?  However, in practice, this is a very common use case.  If a Matrix user switches to a new app or device, it's often very desirable that they can decrypt old conversation history on the new device.  So, we make it configurable per room.  (In today's implementation the ability to share history to new devices is still disabled, but it's coming shortly).

The end result is an entirely new ratchet that we've called Megolm - which is included in the same libolm library as Olm.  The way Megolm works is to give every sender in the room its own encrypted ratchet (‘outbound session'), so every device encrypts each message once based on the current key given by their ratchet (and then advances the ratchet to generate a new key).  Meanwhile, the device shares the state of their ‘outbound session' to every other device in the room via the normal Olm ratchet in a 1:1 exchange between the devices.  The other devices maintain an ‘inbound session' for each of the devices they know about, and so can decrypt their messages.  Meanwhile, when new devices join a room, senders can share their sessions according to taste to the new device - either giving access to old history or not depending on the configuration of the room.  You can read more in the formal spec for Megolm.

We finished the combination of Olm and Megolm back in September 2016, and shipped the very first implementation in the matrix-js-sdk and matrix-react-sdk as used in Riot with some major limitations (no encrypted attachments; no encrypted VoIP signalling; no history sharing to new devices).

Meanwhile, we were incredibly lucky to receive a public security assessment of the Olm & Megolm implementation in libolm from NCC Group Cryptography Services - famous for assessing the likes of Signal, Tor, OpenSSL, etc and other Double Ratchet implementations. The assessment was very generously funded by the Open Technology Fund (who specialise in paying for security audits for open source projects like Matrix).  Unlike other Double Ratchet audits (e.g. Signal), we also insisted that the end report was publicly released for complete transparency and to show the whole world the status of the implementation.

NCC Group have released the public report today - it's pretty hardcore, but if you're into the details please go check it out.  The version of libolm assessed was v1.3.0, and the report found 1 high risk issue, 1 medium risk, 6 low risk and 1 informational issues - of which 3 were in Olm and 6 in Megolm.  Two of these (‘Lack of Backward Secrecy in Group Chats' and ‘Weak Forward Secrecy in Group Chats') are actually features of the library which power the ‘configurable privacy per-room' behaviour mentioned a few paragraphs above - and it's up to the application (e.g. matrix-js-sdk) to correctly configure privacy-sensitive rooms with the appropriate level of backward or forward secrecy; the library doesn't enforce it however.  The most interesting findings were probably the fairly exotic Unknown Key Share attacks in both Megolm and Olm - check out NCC-Olm2016-009 and NCC-Olm2016-010 in the report for gory details!

Needless to say all of these issues have been solved with the release of libolm 2.0.0 on October 25th and included in today's releases of the client SDKs and Riot.  Most of the issues have been solved at the application layer (i.e. matrix-js-sdk, ios-sdk and android-sdk) rather than in libolm itself.  Given the assessment was specifically for libolm, this means that technically the risks still exist at libolm, but given the correct engineering choice was to fix them in the application layer we went and did it there. (This is explains why the report says that some of the issues are ‘not fixed' in libolm itself).

Huge thanks to Alex Balducci and Jake Meredith at NCC Group for all their work on the assessment - it was loads of fun to be working with them (over Matrix, of course) and we're really happy that they caught some nasty edge cases which otherwise we'd have missed.  And thanks again to Dan Meredith and Chad Hurley at OTF for funding it and making it possible!

Implementing decentralised E2E has been by far the hardest thing we've done yet in Matrix, ending up involving most of the core team.  Huge kudos go to: Mark Haines for writing the original Olm and matrix-js-sdk implementation and devising Megolm, designing attachment encryption and implementing it in matrix-{'{'}js,react{'}'}-sdk, Richard van der Hoff for taking over this year with implementing and speccing Megolm, finalising libolm, adding all the remaining server APIs (device management and to_device management for 1:1 device Olm sessions), writing the Implementor's Guide, handling the NCC assessment, and pulling together all the strands to land the final implementation in matrix-js-sdk and matrix-react-sdk.  Meanwhile on Mobile, iOS & Android wouldn't have happened without Emmanuel Rohée, who led the development of E2E in matrix-ios-sdk and OLMKit (the iOS wrappers for libolm based on the original work by Chris Ballinger at ChatSecure - many thanks to Chris for starting the ball rolling there!), Pedro Contreiras and Yannick Le Collen for doing all the Android work, Guillaume Foret for all the application layer iOS work and helping coordinate all the mobile work, and Dave Baker who got pulled in at the last minute to rush through encrypted attachments on iOS (thanks Dave!).  Finally, eternal thanks to everyone in the wider community who's patiently helped us test the E2E whilst it's been in development in #megolm:matrix.org; and to Moxie, Trevor and Open Whisper Systems for inventing the Double Ratchet and for allowing us to write our own implementation in Olm.

It's literally the beginning for end-to-end encryption in Matrix, and we're unspeakably excited to see where it goes.  More now than ever before the world needs an open communication platform that combines the freedom of decentralisation with strong privacy guarantees, and we hope this is a major step in the right direction.

-- Matthew, Amandine & the whole Matrix team.

Further reading:

The Matrix Autumn Special!

12.11.2016 00:00 — In the News Matthew Hodgson

Another season has passed; the leaves are dropping from the trees in the northern hemisphere (actually, in the time it's taken us to finish this post, most of them have dropped :-/) and once again the Matrix team has been hacking away too furiously to properly update the blog. So without further delay here's an update on all things Matrix!

Synapse 0.18

Back in September, we forgot to properly announce the 0.18 release of Synapse! This is a major oversight given that 0.18 was a huge update with some critical performance improvements, but hopefully everyone has upgraded by now anyway. If not, there's never been a better time to run your own homeserver! The main improvement is that the Matrix room state updates are now stored as deltas in the database rather than snapshots, which reduces the size of the database footprint by around 5 - 7x. The first time you run synapse after upgrading to 0.18 it will go through your database deleting all the historical data, after which you can VACUUM the db to reclaim the freed diskspace.

You can tell when it's finished based on whether it's stopped logging about the 'background_deduplicate_state' task. There was a bug in 0.18.0 that meant this process was very slow (weeks) on sqlite DBs and chewed 100% CPU; this was fixed in 0.18.1, and subsequently we've also had 0.18.2 (various perf and bug fixes, and a new modular internal API for authentication) and the current release: 0.18.3 to address a major vulnerability on deployments using LDAP with obsolete versions (0.9.x) of the python ldap3 library - e.g. Debian Stable. Folks using the Debian Stable packages must upgrade immediately.

Other big changes in Synapse 0.18 were:

  • Adding the final APIs required to support end-to-end encryption: specifically, a new store-and-forward API called "to device messaging", which lets messages be passed between specific devices outside the context of a room or a room DAG. This is used for exchanging authentication tokens and sensitive end-to-end key data between devices (e.g. when a new device joins a room and needs to be looped in) and is not intended for general messaging.
  • Changing how remote directory servers are queried. Rather than constantly spidering them via the secondary_directory_servers option (which was causing a load crisis on the matrix.org server, as everyone else in Matrix kept polling it for directory updates), clients can now set a 'server' parameter on the publicRooms request to ask their server to proxy the request through to a specific remote server. Element (the app formally known as Riot/Web) implements this already. This is a stopgap until we have a proper global room discovery database of some kind.
  • Adding pagination support to the room directory API. We now have enough rooms in Matrix that downloading the full list every time the user searches for a room was getting completely untenable - we now support paginating and searching the list. Riot (now Element) and Riot-Android (now Element Android) are using the new APIs already.
  • Basic support for 'direct room' semantics. When you create a room you can now state the intent for that room to be a 1:1 with someone via the is_direct parameter.
  • Making the /notifications API work - this lets clients download a full list of all the notifications a user has been recently sent (highlights, mentions etc)
Spec for all of these new APIs are currently making their way into the official matrix spec; you can see the work in progress at https://matrix.org/speculator. Meanwhile, we're waiting for the last bits of the end-to-end encryption APIs to land there before releasing 0.3 of the Matrix spec, which should happen any day now.

To find out more and get upgraded if you haven't already, please check out the full changelog.

Synapse scalability

Something which we've been quietly adding over the last 6 months is support for running large synapse deployments like the Matrix.org homeserver. Matrix.org has around 500K accounts on it, 50k rooms, and relays around 500K messages per day and obviously the community expects it to have good performance and availability (even though we'd prefer if you ran your own server, for obvious reasons!)

The current scaling approach for this is called 'Workers' - where we've split out a whole bunch of different endpoints from the main Synapse process into child 'worker' processes which replicate their state from the master Synapse process. These workers are designed to scale horizontally, adding as many as you like to handle the traffic load. It's not full active/active horizontal scalability in that you're still limited by the performance of the master process and the database master you're writing to, but it's a great way to escape Python's global interpreter lock limiting processes effectively to a single core, and in practice it's a huge improvement and works pretty well as of Synapse 0.18.

You can read more about the architecture and how to run your Synapse in worker-mode over at https://github.com/matrix-org/synapse/blob/master/docs/workers.rst.

Starting a Riot (now Element)

Meanwhile, the biggest news in Matrixland has probably been the renaming of Vector as Riot (now Element) and the 'mass market' launch of Riot as a flagship Matrix client at the MoNage conference on Sept 19th in Boston. The reasons for renaming Vector have been done to death by now and hopefully folks have got over the shock, but the rationale is to have a more distinctive and memorable (and controversial!) name, which is more aligned with the idea of returning control of communication back to the people :) Amandine has the full story over at the Riot blog.

Riot (now Element) itself is a fairly thin layer on top of the official client Matrix SDKs, and so 95% of the work for Riot (now Element) took the form of updates to matrix-js-sdk, matrix-react-sdk, matrix-ios-sdk, matrix-ios-kit, matrix-android-sdk and synapse itself. There's been a tonne of changes here since June, but the main highlights are:

  • End-to-end encryption support landed in matrix-js-sdk and matrix-react-sdk (and thus Riot/Web (now Element)) and in dev on matrix-ios-sdk and matrix-android-sdk using the Olm and Megolm ratchets. More about this later.
  • Hosted integrations, bots and bridges! More about this later too.
  • Direct Message UI landed in Riot/Web (now Element) to tag rooms which exist for contacting a specific user. These get grouped now as the 'People' list in Riot (now Element). It's in dev on Riot/iOS (now Element (iOS)) & Android (now Element Android).
  • Entirely new UI for starting conversations with people - no more creating a room and then inviting; you just say "i want to talk with Bob".
  • Entirely new UI for inviting people into a room - no more confusion between searching the membership list and inviting users.
  • FilePanel UI in Riot/Web (now Element) to instantly view all the attachments posted in a room
  • NotificationPanel UI in Riot/Web (now Element) to instantly view all your missed notifications and mentions in a single place
  • "Volume control" UI to have finer grained control over per-room notification noisiness
  • Entirely re-worked Room Directory navigator - lazy-loading the directory from the server, and selecting rooms via bridge and remote server
  • It's very exciting to see a wider audience discovering Matrix through Riot (now Element) - and Riot (now Element)'s usage stats have been growing steadily since launch, but there's still a lot of room for improvement.
Stuff on the horizon includes:
  • Formal beta-testing the full end-to-end encryption feature-set.
  • Performance and optimisation work on all platforms - there are huge improvements to be had.
  • Long-awaited poweruser features: 'dark' colour scheme; more whitespace-efficient layout; collapsing consecutive joins/parts...
  • "Landing page" to help explain what's going on to new users and to show deployment-specific announcements and room lists.
  • Support for arbitrary profile information.
  • Threading.
Riot (now Element) releases are announced on #riot:matrix.org, the Riot blog and Twitter - keep your eyes peeled for updates!

End to End Encryption

Full cross-platform end-to-end encryption is incredibly close now, with the develop branches of iOS & Android SDKs and Riot (now Element) currently in internal testing as of Nov 7 - expect a Big Announcement very shortly.  We're very optimistic based on how the initial implementation on Riot/Web (now Element)has been behaving so far.

When E2E first landed on Riot/Web (now Element) in September we were missing mobile support, encrypted attachments, encrypted VoIP signalling, and the ability to retrieve encrypted history on new devices - as well as a formal audit of the underlying Olm and Megolm libraries. Since then things have progressed enormously with most of the core team working since September on filling in the gaps, as well as getting audited and fixing all the weird and wonderful edge cases that the audit showed up. All the missing stuff has been landing on the develop branches over the last few weeks, with encrypted attachments landing on web on Nov 10; encrypted VoIP landing on Nov 11; etc. Watch this space for news on the upcoming cross-platform public beta!

Hosted Integrations and introducing go-neb

One of the new features which arrived in Riot (now Element) is the ability to add "single click" integrations (i.e. bots, bridges, application services) into rooms from Riot/Web (now Element) by clicking the "Manage Integrations" button in Room Settings. These integrations are hosted for free by Riot (now Element) in its production infrastructure (codenamed Scalar), but all the actual bots/bridges/services themselves are normal opensource Matrix apps and you can of course run them yourself too.

screen-shot-2016-11-12-at-11-47-29

The Bot integrations are all provided by go-neb - a complete rewrite in Golang and general reimagining of the old python NEB bot which old-timers will recall as the very first bot written for the Matrix ecosystem. Go-neb has effectively now become a general purpose golang bot/integration framework for Matrix, with the various different services implemented as plugins for Github, JIRA, Giphy, Guggy etc. Critically it supports authenticating Matrix users through to the remote service, letting normal Matrix users interact with Github and friends using their actual Github identity rather than via a bot user - this is a huge huge improvement over the original naive python NEB.

If you like Go and you like Matrix, we'd strongly suggest having a go (hah) at adding new services into go-neb: anything implemented against go-neb will also magically be hosted and available as part of the "Manage Integrations" interface in Riot (now Element), as well as being available to anyone else running their own go-nebs. For full details of the architecture and how to implement new plugins, go check out the full README.

If Matrix is to provide a good FOSS alternative to systems like Slack it's critical to have a large array of available integrations, so we really hope that the community will help us grow the list!

Building Bridges

There have been vast improvements to bridging over the last few months, including the ability to "plumb" bridges into arbitrary rooms (letting you link a single Matrix room through to multiple remote networks). Like go-neb, Riot (now Element) is providing free bridge hosting with the ability to add to rooms with a "single click" via the Manage Integrations button in Room Settings. For now, Riot (now Element) is hosting any bridges built on the matrix-appservice-bridge codebase.

In short, this means that any user can go and take an existing Matrix room and link it through to Slack, IRC, Gitter, and more.

matrix-appservice-irc

Huge amounts of work have gone into improving the IRC bridge - both adding new features to try to give the most IRC-friendly experience when bridging into IRC, as well as lots of maintenance and performance work to ensure that the matrix.org hosted bridges can scale to the large amounts of traffic we're seeing going through Freenode and others. We've also added hosted bridges for OFTC and Snoonet, and turned on connecting via IPv6 by default for networks which support it.

You can read the full changelogs for 0.5.0 and 0.6.0 at https://github.com/matrix-org/matrix-appservice-irc/blob/master/CHANGELOG.md, but the main highlights are:

matrix-appservice-irc 0.6.0

  • Debouncing quits and netsplits: if IRC users quit there can be a window where they are shown as just offline rather than leaving the room, avoiding join/part spam and creating unnecessary state events in Matrix.
  • Topic bridging: IRC topics can now be bridged to Matrix!
  • Support custom SSL CAs (thanks to @Waldteufel)
  • Support custom media repository URLs
  • Support the ability to quit your IRC user from the network entirely
  • Fix rate limiting for traffic from privileged IRC users and services
matrix-appservice-irc 0.5.0:
  • Support throttling reconnections to IRC servers to avoid triggering abuse thresholds
  • Support "Third party lookup": mapping from IRC users & rooms into Matrix IDs for discovery purposes
  • Support rate-limiting membership entries to avoid triggering abuse thresholds
  • Require permission of an IRC chanop when plumbing an IRC channel into a Matrix room
  • Prevent routing loops by advisory m.room.bridging events
  • Better error messages
  • Sync chanmode +s correctly
  • Fix IPv6 support
Next up is automating NickServ login, and generally continuing to make the IRC experience as good as we possibly can.

matrix-appservice-slack

Similarly, the Slack bridge has had loads of work. The main changes include:

  • Ability to dynamically bridge ("plumb") rooms on request
  • Add Prometheus monitoring metrics
  • Ability to discover slack team tokens via OAuth2
  • Sync avatars both ways
We're currently looking at shifting over to Slack's RTM (Real Time Messaging) API rather than using webhooks in order to get an even better fit with Slack and support bridging DMs, but the current setup is still very usable. For more details: https://github.com/matrix-org/matrix-appservice-slack.

matrix-appservice-gitter

The Gitter bridge has provided a lot of inspiration for the more recent work on the Slack bridge. Right now it provides straightforward bridging into Gitter rooms, albeit proxied via a 'matrixbot' user on the Gitter side. We're currently looking at letting also users authenticate using their Gitter credentials so they are bridged through to their 'real' Gitter user - watch this space. For more details: https://github.com/matrix-org/matrix-appservice-gitter.

Community updates

matrix-ircd

matrix-ircd is a rewrite of the old PTO project (pto.im): a Rust application that turns Matrix into a single great big decentralised IRC network. PTO itself has unfortunately been on hiatus and is rather bitrotted, so Erik from the core Matrix Team picked it up to see if it could be resurrected. This ended up turning into a complete rewrite (switching from mio to tokio etc), and the new project can be found at https://github.com/matrix-org/matrix-ircd.

matrix-ircd really is an incredibly promising way of getting folks onto Matrix, as it exposes the entirety of Matrix as a virtual IRC network. This means that IRC addicts can jack straight into Matrix, talking native IRC from their existing IRC clients - but interacting directly with Matrix rooms as if they were IRC channesls without going through a bridge. Obviously you lose all of the features and semantics which Matrix provides beyond IRC, but this is still a great way to get started.

The project is currently alpha but provides a good functioning base to extend, and Erik's explicitly asking for help from the Rust and Matrix community to fill in all the missing features. If you're interested in helping, please come talk on #matrix-ircd:matrix.org!.

matrix-appservice-gitter-twisted

Not to be confused with the Node-based matrix-appservice-gitter, matrix-appservice-gitter-twisted is an entirely separate project written in Python/Twisted by Remram (Remi Rampin) that has the opposite architecture: rather than bridging existing rooms into Matrix, matrix-appservice-gitter-twisted lets you provide your Gitter credentials and acts instead as a Gitter client, bridging your personal view of a Gitter room into a private Matrix room just for you.

This obviously has some major advantages (your actions on Gitter use your real Gitter account rather than a bot), and some disadvantages too (you can't use Matrix features when interacting with other Matrix users in the same room, and the Gitter channel itself is not decentralised into Matrix). However, it's a really cool example of how the other model can work - and within the core team, we've been arguing back and forth for ages now on whether normal bridges or "sidecar" bridges like this one are a more preferable architecture. Thanks to Remram's work we can try both side by side! Go check it out at https://github.com/remram44/matrix-appservice-gitter-twisted.

telematrix

Telematrix is Telegram<->Matrix bridge, written by Sijmen Schoon using python3 and asyncio. Right now it's a fairly early alpha hardcoded to bridge a specific Telegram channel into a specific Matrix room, but it works and in use and could be an excellent base for folks interested in a more comprehensive Matrix/Telegram bridge. Go check it out at https://github.com/SijmenSchoon/telematrix telematrix

Ruma

Meanwhile, the Ruma project to write a Matrix homeserver in Rust has been progressing steadily, with more and more checkboxes appearing on the status page, with significant new contributions from mujx and farodin91. The best way to keep track of Ruma is to read Jimmy's excellent This Week in Ruma updates and of course hang out on #ruma:matrix.org.

NaChat

An entirely new client on the block since the last update is NaChat, written by Ralith. NaChat is a pure cross-platform Qt/C++ desktop client written from the ground up, supporting local history synchronisation, excellent performance, native Qt theming, and generally being a lean and mean Matrix client machine. It's still alpha, but it's easy to build and a lot of fun to play with.

screen-shot-2016-11-12-at-12-01-03

Please give a spin, encourage Ralith to finish the timeline-view-rewrite branch (which is probably the one you want to be running!), and come hang out on #nachat:matrix.org.

Quaternion

Meanwhile, the Quaternion Qt/QML desktop client and its libqmatrixclient library has been making sure and steady progress, with fxrh, kitsune, maralorn and others working away at it. The difference with NaChat here is using QML rather than native Qt widgets, and a focus on more advanced UX features like a custom infinite-scrolling scrollbar widget, unread message notifications, and read-up-to markers.  Recent developments include the first official release (0.0.1) on Sept 12, official Windows builds, lots of work on implementing better Read-up-to Markers, scrolling behaviour etc. Again, it's worth keeping a checkout of Quaternion handy and playing with the client - it's loads of fun!

screen-shot-2016-11-12-at-12-12-48

Google Summer of Code 2016 Retrospective

The summer is long gone now, and along with it Google Summer of Code. This was the first year we've participated in GSoC, and it was an incredible experience - both judging all the applications, and then working with Aviral Dasgupta and Will Hunt (Half-Shot) who joined the core team as part of their GSoC endeavours.

Aviral's work has been widespread throughout Riot (now Element): adding consistent Emoji support throughout the app via Emoji One, implementing the beta Rich Text Editor (RTE) and all-new autocompletion UI, as well as a bunch of spec proposals for rich message semantics and an initial Slack Webhooks application service. You can read his wrap up at http://www.aviraldg.com/p/gsoc-2016-wrapup and use the code in Riot/Web (now Element) today. We're currently working on fixing the final issues on RTE and auto-complete and hope to enable them by default real soon now!

Meanwhile, Half-Shot's work ended up focusing on bridging through to Twitter and working on the Threading spec proposal for Matrix. You can find out all about the Twitter bridge at https://half-shot.github.io/matrix-appservice-twitter; it works incredibly well (arguably too well, given the amount of traffic it can bridge into Matrix! :S) - and we're currently working on hosting a version of it on matrix.org for all your tweeting needs. You can see Half-Shot's wrapup blog post over at https://half-shot.uk/gsoc16_evaulation.

Finally, as a bit of a wildcard, we discovered the other day that there was also another GSoC project using Matrix by Waqee Khalid, supported by the Berkman Center for Internet and Society at Harvard to switch Apache Wave (formerly Google Wave) over to using Matrix rather than XMPP for federation!  The implementation looks a little curious here, as Wave used XMPP as a blunt pubsub layer for synchronising protobuf deltas - and it looks like this implementation uses Matrix similarly, thus killing any interop with the rest of Matrix, which is a bit of a shame.  If anyone knows more about the project we'd love to hear though!

Either way, it's been a pleasure to work with the GSoC community and we owe Aviral and Half-Shot (and Waqee!) a huge debt of gratitude for spending their summers (and more!) hacking away improving Matrix. So, thanks Google for making GSoC possible and thanks to the GSoCers for all their contributions, effort & enthusiasm! Watch this space for updates on RTE, new-autocomplete and the twitter bridge going live...

Matrix in the news

Just in case you missed them, there have been a couple of high profile articles flying around about Matrix recently - we made the front cover of Linux Magazine in August with a comprehensive review of Matrix and Vector (now Riot (now Element)). Then when we launched Riot (now Element) itself we got a cautiously positive write-up from Mike Butcher at Techcrunch. We also wrote an guest column for Techcrunch about the importance of bringing power back to the people via decentralisation, which got a surprising amount of attention on HackerNews and elsewhere.

More recently, we were lucky enough to get an indepth video interview with Bryan Lunduke as part of his 'Linux & Whatnot' series, and also a write-up in NetworkWorld alongside Signal & Wire as part of Bryan's journeys in the land of encrypted messaging.

screen-shot-2016-11-12-at-12-31-34

Huge thanks to everyone who's been nice enough to spread the word of Matrix!

Matrix In Real Life

Finally, we've been present at a slew of different events. In August we attended FOSSCON again in Philadelphia to give a general update on Matrix to the Freenode community...

It's @fossconNE time! Dave will be talking about Matrix at 1pm today. Come & say hi! pic.twitter.com/KtfTVRnAVn

— Matrix (@matrixdotorg) August 20, 2016

...and then Riot (now Element) was launched at Monage in Boston in September, with Matthew and Amandine respectively presenting Matrix and Riot (now Element):

Best #swag of #MoNage? The @RiotChat stand is getting mobbed :) pic.twitter.com/NltlfO74Y9

— Oisin Lunny (@oisinlunny) September 20, 2016

Whilst quite a small event, the quality of folks present was incredibly high - much fun was had comparing open communities to walled gardens with Nicola Greco from Tim Berners-Lee's Solid project...

.@ara4n & @AmandineLePape showing off the new Riot (now Element) at #MoNage in #Boston! pic.twitter.com/U0qSNjNLGs

— Riot (@RiotChat) September 21, 2016

...comparing notes with the founders of ICQ, hanging out with Alan from Wire...

A meeting of the messaging minds! @Wire and @matrixdotorg federating over a pint at #MoNage. pic.twitter.com/uTUvWrKRqp

— Oisin Lunny (@oisinlunny) September 21, 2016

...chatting to FireChat's CTO, catching up with Dan York from the Internet Society, etc.

Then in October we spoke about scaling Python/Twisted for Matrix at PyCon France in Rennes - this was really fun, albeit slightly embarrassing to be the only talk about Python/Twisted in a track otherwise entirely about Python 3 and asyncio :D That said, the talk seemed to be well received and it was fantastic to meet some of the enthusiastic French Python community and see folks in the audience who were already up and running on Matrix!

Lots of fun at @pyconfr today demoing Matrix, including a quick video conference with the audience & #TADHack London! pic.twitter.com/rwbA43X7wB

— Matrix (@matrixdotorg) October 15, 2016

The same weekend also featured TADHack Global - we were present at the London site; you can read all about it in our earlier blog post. There was a really high standard of hacks on Matrix this year, and it was incredibly hard to judge the hackathon. In most ways this is a good problem to have though!

Dramatic prep for @maffydub and yinyee's Matrix IoT demo with multiple ESP8266, proximity sensor, and @tropo ASR!! pic.twitter.com/eytG8QWFq6

— Matrix (@matrixdotorg) October 16, 2016

Meanwhile, coming up on the horizon we have TADSummit in Lisbon next week, where we'll be giving an update on Matrix to the global Telco Application Developer community, and then the week after we'll be in Israel as part of the Geektime Techfest, Devfest and Chatbot Summit. So if you're in Lisbon or Tel Aviv do give us a ping on Matrix and come hang out!

Matrixing for fun and profit!

If you've read this far, we're guessing you're hopefully quite interested in Matrix (or just skipping to the end ;).  Something we don't talk about as much as we should is that if you're interested in being paid to work on Matrix full time, we're always interested in expanding the core team.  Right now we're particularly looking for:

  • Experienced front-end developers who can help build the next generation of matrix-react-sdk and vector-web
  • Professional tech-writers to help keep The Spec and tutorials and other docs updated and as kick-ass as possible
  • Backend Python/Twisted or Golang wizards to help us improve and evolve Synapse
  • Mobile developers (especially Android) to help keep the mobile SDKs and apps evolving as quickly as possible
  • Integration fiends who'd like to be paid to build more bridges, bots and other integrations for the overall ecosystem!
Most of the core team hangs out in London or Rennes (France), but we're also open to remote folks where it makes sense.  If this sounds interesting, please shoot us a mail to jobs@matrix.org.  Obviously it helps enormously if we already know you from the Matrix community, and you have a proven FOSS track record.

Conclusion

Apologies once again for an overdue and overlong update, but hopefully this gives a good taste of how Matrix is progressing. Just to give a different datapoint: this graph is quite interesting - showing the volume of events per day sent by native (i.e. non-bridged) Matrix users visible to the matrix.org homeserver since we turned the service on back in 2014:

screen-shot-2016-11-04-at-11-02-58-1

As you can see, things are accelerating quite nicely - and this is ignoring all the traffic in the rest of the Matrix ecosystem that happens not to be federated onto the matrix.org HS, not to mention the huge amounts of traffic due to bridging.

Our plans over the next few months are going to involve:

  • Turning on end-to-end encryption by default for any rooms with private history - whilst ensuring it's as easy to write Matrix clients, bots and bridges as it ever was.
  • Yet more scalability and performance work across the board, to ensure Synapse and the client SDKs can handle the growth curve we're seeing here
  • Releasing 0.3.0 of the matrix spec itself.
  • Making Riot (now Element)'s UX excellent.
  • Editable messages.
  • Threading.
  • User groups, for applying permissions/invites etc to groups of users as well as individuals.
  • Formalising the federation spec at last
  • As many bots, bridges and other integrations as possible!
  • Making VoIP/Video conferencing and calling awesome.
  • More experiments with next-generation homeservers
  • Starting to really think hard about decentralised identity and reputation/spam management
  • ...and a few new things we don't want to talk about yet ;)
If you've got this far - congratulations! Thanks for reading, and thank you for supporting the Matrix ecosystem.

Now more than ever before we believe that it is absolutely critical to have a healthy and secure decentralised communications ecosystem on the 'net (whether that's Matrix, XMPP, Tox or whatever) - so thank you again for participating in our one :)  And if you don't already run your server, please grab a Synapse and have fun!

  • Matthew, Amandine & the Matrix Team.

The Matrix Summer Special!!

04.07.2016 00:00 — General Matthew Hodgson

Hi folks - another few months have gone by and once again the core Matrix team has ended up too busy hacking away on the final missing pieces of the Matrix jigsaw puzzle to have been properly updating the blog; sorry about this. The end is in sight for the current crunch however, and we expect to return to regular blog updates shortly! Meanwhile, rather than letting news stack up any further, here's a quick(?) attempt to summarise all the things which have been going on!

Synapse 0.16.1 released!

This one's a biggy: in the mad rush during June to support the public debut for Vector, we made a series of major Synapse releases which apparently we forgot to tell anyone about (sorry!). The full changelog is at the bottom of the post as it's huge, but the big features are:

  • Huge performance improvements, including adding write-thru event caches and improving caching throughout, and massive improvements to the speed of the room directory API.
  • Add support for inline URL previewing!
  • Add email notifications!
  • Add support for LDAP authentication (thanks to Christoph Witzany)
  • Add support for JWT authentication (thanks to Niklas Riekenbrauck)
  • Add basic server-side ignore functionality and abuse reporting API
  • Add ability to delegate /publicRooms API requests to a list of secondary homeservers
  • Lots and lots and lots of bug fixes.
If you haven't upgraded, please do asap from https://github.com/matrix-org/synapse!

There's also been a huge amount of work going on behind the scenes on horizontal scalability for Synapse.  We haven't drawn much attention to this yet (or documented it) as it's still quite experimental and in flux, but the main change is to add the concept of application-layer replication to Synapse - letting you split the codebase into multiple endpoints which can then be run in parallel, each replicating their state off the master synapse process.  For instance, right now the Matrix.org homeserver is actually running off three different processes: the main synapse; another specific to calculating push notifications and another specific to serving up the /sync endpoint.  These three are then abstracted behind the dendron layer (which also implements the /login endpoint). The idea is that one can then run multiple instances of the /sync and pusher (and other future) endpoints to horizontally scale.  For now, they share a single database writer, but in practice this has improved our scalability and performance on the Matrix.org HS radically.

In future we'll actually document how to run these, as well as making it easy to spin up multiple concurrent instances - in the interim if you find you're hitting performance limits running high-traffic synapses come talk to us about it on #matrix-dev:matrix.org.  And the longer term plan continues to be to switch out these python endpoint implementations in future for more efficient implementations.  For instance, there's a golang implementation of the media repository currently in development which could run as another endpoint cluster.

Vector released!

Much has been written about this elsewhere, but Web, iOS and Android versions of the Vector clients were finally released to the general public on June 9th at the Decentralised Web Summit in San Francisco.  Vector is a relatively thin layer on top of the matrix-react-sdk, matrix-ios-sdk and matrix-android-sdk Matrix.org client SDKs which showcases Matrix's collaboration and messaging capabilities in a mass-market usable app.  There's been huge amounts of work here across the SDKs for the 3 platforms, with literally thousands of issues resolved.  You can find the full SDK changelogs on github for React, iOS and Android.  Some of the more interesting recent additions to Vector include improved room notifications, URL previews, configurable email notifications, and huge amounts of performance stability work.

Future work on Vector is focused on showcasing end-to-end encryption, providing a one-click interface for adding bots/integrations & bridges to a room, and generally enormously improving the UX and polish.  Meanwhile, there's an F-Droid release for Android landing any day now.

If you haven't checked it out recently, it's really worth a look :)

Vector

Matrix Spec 0.1.0

In case you didn't notice, we also released v0.1.0 of the Matrix spec itself in May - this is a fairly minor update which improves the layout of the document somewhat (thanks to a PR from Jimmy Cuadra) and a some bugfixes.  You can see the full changelog here. We're overdue a new release since then (albeit again with relatively minor changes).

Google Summer of Code

We're in the middle of the second half of GSoC right now, with our GSoC students Aviral and Half-Shot hacking away on Vector and Microblogging projects respectively.  There's a lot of exciting stuff coming out of this - Aviral contributing Rich Text Editing, Emoji autocompletion, DuckDuckGo and other features into Vector (currently on branches, but will be released soon) and Half-Shot building a Twitter bridge as part of his Matrix-powered microblogging system.  Watch this space for updates!

Ruma

Lots of exciting stuff has been happening recently over at Ruma.io - an independent Matrix homeserver implementation written in Rust.  Over the last few weeks Jimmy and friends have got into the real meat of implementing events and the core of the Matrix CS API, and as of the time of writing they're the topmost link on HackerNews!  There's a lot of work involved in writing a homeserver, but Ruma is looking incredibly promising and the feedback from their team has been incredibly helpful in keeping us honest on the Matrix spec and ensuring that it's fit for purpose for 3rd party server implementers.

Also, Ruma just released some truly excellent documentation as a high-level introduction to Matrix (thanks to Leah and Jimmy) - much better than anything we have on the official Matrix.org site.  Go check it out if you haven't already!

End to End Encryption

There has been LOADS of work happening on End to End encryption: finalising the core 1:1 "Olm" cryptographic ratchet; implementing the group "Megolm" ratchet (which shares a single ratchet over all the participants of a room for scalability); fully hooking Olm into matrix-js-sdk and Vector-web at last, and preparing for a formal and published-to-the-public 3rd party security audit on Olm which will be happening during July.

This deserves a post in its own right, but the key thing to know is that Olm is almost ready - and indeed the work-in-progress E2E UX is even available on the develop branch of vector if you enable E2E in the new 'Labs' section in User Settings.  Olm itself is usable only for 'burn after reading' strictly PFS messages, but Megolm integration with Vector & Synapse will follow shortly afterwards which will finally provide the E2E nirvana we've all been waiting for :)

Decentralised Web Summit

Matrix had a major presence as a sponsor at the first ever Decentralised Web Summit hosted by the Internet Archive in San Francisco back in June.  This was a truly incredible event - with folks gathering from across the world to discuss, collaborate and debate on ensure that the web is not fragmented or trapped into proprietary silos - with the likes of Tim Berners-Lee, Vint Cerf and Brewster Kahle in attendance.  We ran a long 2 hour workshop on Matrix and showed off Vector to anyone and everyone - and meanwhile the organisers were kind enough to promote Matrix as the main decentralised chat interface for the conference itself (bridged with their Slack).  A full writeup of the conference really merits a blog post in its own right, but the punchline is that you could genuinely tell that this is the beginning of a new era of the internet - whether it's using Merkle DAGs (like Matrix) or Blockchain or similar technologies: we are about to see a major shift in the balance of power on the internet back towards its users.

We strongly recommend checking out the videos which have all been published at Decentralised Web Summit, including lightning talks introducing both Matrix and Vector, and digging into as many of the projects advertised as possible.  It was particularly interesting for us to get to know Tim Berners-Lee's latest project at MIT: Solid - which shares quite a lot of the same goals as Matrix, and subsequently seeing Tim pop up on Matrix via Vector.  We're really looking forward to working out how Matrix & Solid can complement each other in future.

Matthew, Tim Berners-Lee and Matrix

Matrix.to

Not the most exciting thing ever, but heads up that there's a simple site up at https://matrix.to to provide a way of doing client-agnostic links to content in Matrix.  For instance, rather than linking specifically into an app like Vector, you can now say https://matrix.to/#/#matrix:matrix.org to go there via whatever app you choose.  This is basically a bootstrapping process towards having proper mx:// URLs in circulation, but given mx:// doesn't exist yet, https://matrix.to hopefully provides a useful step in the right direction :)

PRs very welcome at https://github.com/matrix-org/matrix.to.

Bridges and Bots

Much of the promise of Matrix is the ability to bridge through to other silos, and we've been gradually adding more and more bridging capabilities in.

For instance, the IRC bridge has had a complete overhaul to add in huge numbers of new features and finally deployed for Freenode a few weeks ago:

New Features:

  • Nicks set via !nick will now be preserved across bridge restarts.
  • EXPERIMENTAL: IRC clients created by the bridge can be assigned their own IPv6 address.
  • The bridge will now send connection status information to real Matrix users via the admin room (the same room !nickcommands are issued).
  • Added !help.
  • The bridge will now fallback to body if the HTML content contains any unrecognised tags. This makes passing Markdown from Matrix to IRC much nicer.
  • The bridge will now send more metrics to the statsd server, including the join/part rate to and from IRC.
  • The config option matrixClients.displayName is now implemented.
Bug fixes:
  • Escape HTML entities when sending from IRC to Matrix. This prevents munging occurring between IRC formatting and textual < element > references, whereby if you sent a tag and some colour codes from IRC it would not escape the tag and therefore send invalid HTML to Matrix.
  • User IDs starting with - are temporarily filtered out from being bridged.
  • Deterministically generate the configuration file.
  • Recognise more IRC error codes as non-fatal to avoid IRC clients reconnecting unnecessarily.
  • Add a 10 second timeout to join events injected via the MemberListSyncer to avoid HOL blocking.
  • 'Frontier' Matrix users will be forcibly joined to IRC channels even if membership list syncing I->M is disabled. This ensures that there is always a Matrix user in the channel being bridged to avoid losing traffic.
  • Cache the /initialSync request to avoid hitting this endpoint more than once, as it may be very slow.
  • Indexes have been added to the NeDB .db files to improve lookup times.
  • Do not recheck if the bridge bot should part the channel if a virtual user leaves the channel: we know it shouldn't.
  • Refine what counts as a "request" for metrics, reducing the amount of double-counting as requests echo back from the remote side.
  • Fixed a bug which caused users to be provisioned off their user_id even if they had a display name set.
Meanwhile, a Gitter bridge is in active development (and in testing with the Neovim community on Gitter/Matrix/Freenode), although lacking documentation so far.

Finally, NEB - the Matrix.org bot framework is currently being ported from Python to Golang to act as a general Go SDK for rapidly implementing new bot capabilities.

There's little point in all of the effort going into bridges and bots if it's too hard for normal users to deploy them, so on the Vector side of things there's an ongoing project to build a commercial-grade bot/bridge hosted service offering for Matrix which should make it much easier for non-sysadmins to quickly add their own bots and bridges into their rooms.  There's nothing to see yet, but we'll be yelling about it when it's ready!

Conclusion

I'm sure there's a lot of stuff missing from the quick summary above - suffice it to say that the Matrix ecosystem is growing so fast and so large that it's pretty hard to keep track of everything that's going on.  The big remaining blockers we see at this point are:

  • End-to-end Encryption roll-out
  • Polishing UX on Vector - showing that it's possible to build better-than-Slack quality UX on top of Matrix
  • Bots, Integrations and Bridges - making them absolutely trivial to build and deploy, and encouraging everyone to write as many as they can!
  • Improving VoIP, especially for conferencing, especially on Mobile
  • Threading
  • Editable messages
  • Synapse scaling and stability - this is massively improved, but there's still work to be done.  Meanwhile projects like Ruma give us hope for light at the end of the Synapse tunnel!
  • Spec refinements - there are still a lot of open spec bugs which we need to resolve so we can declare the spec (and thus Matrix!) out of beta.
  • More clients - especially desktop ones; helping out with Quaternion, Tensor, PTO, etc.
...and then all the pieces of the jigsaw will finally be in place, and Matrix should hopefully fulfil its potential as an invaluable, open and decentralised data fabric for the web.

Thanks, once again, to everyone who's been supporting and using Matrix - whether it's by hanging out in the public chatrooms, running your own server, writing your own clients, bots, or servers, or just telling your friends about the project.  The end of the beginning is in sight: thanks for believing in us, and thank you for flying Matrix.

Matthew, Amandine & the Matrix Team.

Appendix: The Missing Synapse Changelogs

Changes in synapse v0.16.1 (2016-06-20)

Bug fixes:

  • Fix assorted bugs in /preview_url (PR #872)
  • Fix TypeError when setting unicode passwords (PR #873)
Performance improvements:
  • Turn use_frozen_events off by default (PR #877)
  • Disable responding with canonical json for federation (PR #878)

Changes in synapse v0.16.1-rc1 (2016-06-15)

Features: None

Changes:

  • Log requester for /publicRoom endpoints when possible (PR #856)
  • 502 on /thumbnail when can't connect to remote server (PR #862)
  • Linearize fetching of gaps on incoming events (PR #871)
Bugs fixes:
  • Fix bug where rooms where marked as published by default (PR #857)
  • Fix bug where joining room with an event with invalid sender (PR #868)
  • Fix bug where backfilled events were sent down sync streams (PR #869)
  • Fix bug where outgoing connections could wedge indefinitely, causing push notifications to be unreliable (PR #870)
Performance improvements:
  • Improve /publicRooms performance (PR #859)

Changes in synapse v0.16.0 (2016-06-09)

NB: As of v0.14 all AS config files must have an ID field.

Bug fixes:

  • Don't make rooms published by default (PR #857)

Changes in synapse v0.16.0-rc2 (2016-06-08)

Features:

  • Add configuration option for tuning GC via gc.set_threshold (PR #849)
Changes:
  • Record metrics about GC (PR #771, #847, #852)
  • Add metric counter for number of persisted events (PR #841)
Bug fixes:
  • Fix 'From' header in email notifications (PR #843)
  • Fix presence where timeouts were not being fired for the first 8h after restarts (PR #842)
  • Fix bug where synapse sent malformed transactions to AS's when retrying transactions (Commits310197b, 8437906)
Performance Improvements:
  • Remove event fetching from DB threads (PR #835)
  • Change the way we cache events (PR #836)
  • Add events to cache when we persist them (PR #840)

Changes in synapse v0.16.0-rc1 (2016-06-03)

Version 0.15 was not released. See v0.15.0-rc1 below for additional changes.

Features:

  • Add email notifications for missed messages (PR #759, #786, #799, #810, #815, #821)
  • Add a url_preview_ip_range_whitelist config param (PR #760)
  • Add /report endpoint (PR #762)
  • Add basic ignore user API (PR #763)
  • Add an openidish mechanism for proving that you own a given user_id (PR #765)
  • Allow clients to specify a server_name to avoid 'No known servers' (PR #794)
  • Add secondary_directory_servers option to fetch room list from other servers (PR #808, #813)
Changes:
  • Report per request metrics for all of the things using request_handler (PR #756)
  • Correctly handle NULL password hashes from the database (PR #775)
  • Allow receipts for events we haven't seen in the db (PR #784)
  • Make synctl read a cache factor from config file (PR #785)
  • Increment badge count per missed convo, not per msg (PR #793)
  • Special case m.room.third_party_invite event auth to match invites (PR #814)
Bug fixes:
  • Fix typo in event_auth servlet path (PR #757)
  • Fix password reset (PR #758)
Performance improvements:
  • Reduce database inserts when sending transactions (PR #767)
  • Queue events by room for persistence (PR #768)
  • Add cache to get_user_by_id (PR #772)
  • Add and use get_domain_from_id (PR #773)
  • Use tree cache for get_linearized_receipts_for_room (PR #779)
  • Remove unused indices (PR #782)
  • Add caches to bulk_get_push_rules* (PR #804)
  • Cache get_event_reference_hashes (PR #806)
  • Add get_users_with_read_receipts_in_room cache (PR #809)
  • Use state to calculate get_users_in_room (PR #811)
  • Load push rules in storage layer so that they get cached (PR #825)
  • Make get_joined_hosts_for_room use get_users_in_room (PR #828)
  • Poke notifier on next reactor tick (PR #829)
  • Change CacheMetrics to be quicker (PR #830)

Changes in synapse v0.15.0-rc1 (2016-04-26)

Features:

  • Add login support for Javascript Web Tokens, thanks to Niklas Riekenbrauck (PR #671,#687)
  • Add URL previewing support (PR #688)
  • Add login support for LDAP, thanks to Christoph Witzany (PR #701)
  • Add GET endpoint for pushers (PR #716)
Changes:
  • Never notify for member events (PR #667)
  • Deduplicate identical /sync requests (PR #668)
  • Require user to have left room to forget room (PR #673)
  • Use DNS cache if within TTL (PR #677)
  • Let users see their own leave events (PR #699)
  • Deduplicate membership changes (PR #700)
  • Increase performance of pusher code (PR #705)
  • Respond with error status 504 if failed to talk to remote server (PR #731)
  • Increase search performance on postgres (PR #745)
Bug fixes:
  • Fix bug where disabling all notifications still resulted in push (PR #678)
  • Fix bug where users couldn't reject remote invites if remote refused (PR #691)
  • Fix bug where synapse attempted to backfill from itself (PR #693)
  • Fix bug where profile information was not correctly added when joining remote rooms (PR #703)
  • Fix bug where register API required incorrect key name for AS registration (PR #727)