Sunday, June 07, 2009

Architecture, Mobiles, and Health: 10 pitfalls

The “eHealth” space (which obviously includes the mobile, mHealth aspects) is a bit too chaotic from the perspective of a typical developing country. Imagine you are responsible for ICT (Information and Communication Technology) at a ministry of health or a hospital that wants to modernize to improve patient outcomes or disease detection. Where do you start? What could work and what won’t, for you? What is reliable? What is the fine print?

Unfortunately, this is not just because of a rapid pace of innovation in technology, or the extreme conditions in which these health solutions have to exist.

Some of the confusion is, unwittingly, created and perpetuated by the same organizations that are trying to help in the space. This includes international organizations, academia, NGOs, funders, open technology groups, private tech vendors, etc. Types of issues I’ve run into first-hand include:

  • Academic projects that collect data with preference towards information that will help to publish a paper rather than the information that will be the most actionable or help community health the most. The project rarely fits in with other technologies already deployed.
  • Funders that sponsor the construction of specialized, one-off, disease-specific systems that are built from scratch even if architecturally they are the same as other specialized, one-off, disease-specific projects.
  • Technology vendors fostering ‘data sharing’ projects where the data ends up shared but, unfortunately, ‘owned’ by the vendor.
  • Open technology projects that would rather accrete features or add cool gizmos that attract users into a do-it-all system rather than open up information and let the data flow around to other applications.
  • Groups that would rather implement anything new, now, regardless of what already works, than help a developing country figure out what it really needs.

Some of these organizations are fortunately waking up to these issues and starting initiatives to reduce their occurrence. A key component of these initiatives is bringing an architectural approach to the evaluation, planning, implementation and assessment of ICT needs. And fortunately these organizations have people who both know the problem space and have worked as architects in other contexts.

By an ‘architectural approach’ I mean an approach that:

  • Separates the discussions of capability from implementation. For example, a medical record system is a capability a hospital needs; OpenMRS and OpenVISTA are two implementation alternatives that could fulfill that need.
  • Understands the role of standards in supporting interoperable building blocks that can evolve over time, not as an end in itself.
  • Helps transition the end goals, requirements and capabilities of the overall health system  - the ‘business’ architecture - into ‘technology’, ‘integration’ and ‘infrastructure’ architectures that only exist to support the end goals.
  • Navigates the tension between the potential benefits of centralized, top-down decision making around ICT versus the potential benefits of decentralized, bottom-up decision making.

What would it look like from the perspective of an implementer if the eHealth/mHealth community took such an approach? Here are some things you could imagine:

  • You would get something like a capability map, a set of boxes with labels and lines that describe common elements of an eHealth countrywide health information system (HIS), including capabilities such as medical records, biosurveillance, pharmacy stock management, etc.
  • You would be able to write on this map which capabilities you have implemented (digitally or not), and for each capability get some performance metrics that can help you rank its effectiveness. For example, a biosurveillance component would assess the timeliness and completeness of reports. Your capability map would help you do an assessment against these metrics, letting you see your maturity and your weak spots. This assessment by itself is a huge asset for a country and funders, as it lets you understand the landscape before you aim your efforts.
  • Using the same taxonomy of capabilities, a technology team should be able to find open source solutions, papers, and case studies that describe if/how the capability can be improved. Ideally, these case studies should roll up to a community-maintained pattern library, that describes the distilled “solutions to a problem in a context” that have been discovered previously.
  • Any improvements can be measured over time and pilots can be assessed objectively as to how much they contribute to the goals of the country (currently, organizations running pilots set up their own measures and they aren’t always traceable to the measures a host country cares about).
  • Funders could work together helping implement solutions that work together and not on a per-project, per-disease basis.
  • Finally, any local innovations could be tracked and published against that map, helping discovery by others wanting to implement them elsewhere, contribute code, etc. Assisting the discovery and amplification of bottom-up ideas is critical as the eHealth space is very much taking its first steps.
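
To make the capability map idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Capability` and `Report` names, the two metric definitions (share of expected reports that arrived, and share that arrived by their due date), and the 0.8 threshold are hypothetical and not taken from any existing framework.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Report:
    due: date                        # when the report was expected
    received: Optional[date] = None  # None means it never arrived


@dataclass
class Capability:
    name: str
    implemented: bool = False        # digitally or on paper
    reports: List[Report] = field(default_factory=list)

    def completeness(self) -> float:
        """Fraction of expected reports that arrived at all."""
        if not self.reports:
            return 0.0
        arrived = [r for r in self.reports if r.received is not None]
        return len(arrived) / len(self.reports)

    def timeliness(self) -> float:
        """Fraction of expected reports that arrived on or before the due date."""
        if not self.reports:
            return 0.0
        on_time = [
            r for r in self.reports
            if r.received is not None and r.received <= r.due
        ]
        return len(on_time) / len(self.reports)


def weak_spots(capabilities, threshold=0.8):
    """Names of capabilities that are missing or score below the threshold."""
    return [
        c.name for c in capabilities
        if not c.implemented
        or c.completeness() < threshold
        or c.timeliness() < threshold
    ]
```

A country team could fill in one `Capability` per box on the map and call `weak_spots` to see where to aim first; real metrics would of course be agreed with the ministry rather than hard-coded.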

An architectural approach makes it easier to implement, build and fund technology for eHealth. So let’s look at what holds this space back and some potential issues that may crop up by rushing in.

Pitfalls of an architectural approach

These pitfalls are not inherent to any and all architecture efforts, rather, they are risks that can be managed and mitigated. I am sharing them because I’ve seen these sap energy out of what otherwise could have been a great contribution:

  • 1. One Size Fits All / Blueprints with no context: I’ve seen architecture efforts fail because they create blueprints that don’t consider the target context. Think about why a city apartment is different from a beach house, even if they have a lot in common. mHealth solutions will vary country to country due to factors such as different mobile penetration, language and literacy, cultural factors, population distributions. A good architectural approach would consider context as a first-class citizen. A great investment would be to evolve pattern languages for the eHealth/mHealth space, because they inherently bring in context to the equation. This is tough, however, because understanding context requires experience and on-the-ground presence which is expensive, and requires time, and takes away the illusory charm of cookie cutter answers.
  • 2. “Best practices” advertised while the paint is still wet: There is a huge hunger for best practices. In a field as new as mHealth, things that work once get a lot of press. I always recommend focusing on proven practices rather than best practices, and evaluating on impact metrics (e.g. birth complications averted) rather than proxy measures such as adoption or usage metrics (“30 users”) or satisfaction (“so-and-so is thrilled”). This is especially tough because impact metrics may take months or years to budge, and while subjective evaluation is critical, many organizations work heavily with per diems that distort the value proposition of an effort. (For those of you not familiar with the term, a per diem boils down to compensation as in “If you come and [work with my project] for a day we’ll pay your staff $5 each”. Everyone would agree it’s hard to design compensation for ‘customers’ that doesn’t create conflicts of interest.) A good catalog of solutions would be transparent about the impact metrics and evaluation timeframes (it ran for a week, it ran for a year) of implementations or pilots (unfortunately there are a lot of systemic disincentives on all parties involved to publish this information raw).
  • 3. Star charts for the high priests: It is common to see an architecture effort devolve into a debate about frameworks, representation and notation, a debate with language and artifacts that only ‘a chosen few’ can understand. Be wary if you see UML diagrams with OCL expressions, or diagrams that claim code generation as a goal. Notations are only useful if they help comprehension. Boxes and lines, yay! Make sure the stakeholders can use the artifacts, not just a chosen few. And don’t be fooled: UML and any specialized notation has been used many times to hide bad thinking behind a veneer of formality. A good architecture effort would communicate in a language and notation that is simple even if not formal. Even better, it would provide a reference architecture and reference implementations as a starting point for common scenarios (“Show me”. Heck, you could even have virtual machines with things deployed and running). In my experience a good set of documents outlining tradeoffs and decision points goes a long way towards helping implementation, more than a complete Zachman or TOGAF analysis or a detailed BPEL workflow.
  • 4. Shipping technology versus building capacity: A good effort would specify the relevant skills and communities needed to implement technologies, and give pointers on how to get those skills, not just pointers to consulting organizations who can drop-ship products that do the job. For an effort to be sustainable, your users have to understand the goals that the technology supports, and your IT staff needs to understand the technology better than superficially. National or regional labs like InSTEDD innovation labs would be a great asset to the ecosystem of eHealth initiatives.
  • 5. Architecture antipatterns. "An anti-pattern is something that looks like a good idea, but which backfires badly when applied." (Jim Coplien). It sounds obvious that one should avoid them, but some antipatterns are like flypaper: one keeps getting stuck on them, and they aren’t well documented. Architecture efforts that rely on heavy top-down prescription are very prone to recommending antipatterns as they don’t have immediate feedback loops. Antipatterns are a bit like icebergs: you know they are out there, you can navigate around them if you are watching out for them, but folks are too embarrassed to document any close encounters with one. To discover these troublemakers early and nip them in the bud, watch out for designs that make sense to engineers but don’t make as much sense to users, or for ‘grafts’ of things that work in other contexts. For example, a common antipattern is recommending single-master centralized data repositories for information that spans many sectors or agencies. Another one is assuming a process or technology that works for 2 weeks for 20 people can scale to a national rollout. Good architecture guidance would have appropriate risks associated with each capability, validated by real case studies.
  • 6. The Master Data Model (capitalization required). This is a common antipattern, but it deserves its own bullet. The pitfall is assuming you can model the data of a domain a priori, share it across organizations and applications, and then implement software following that model (e.g. a patient has a first name, a last name, date of birth…). It is possible but very inefficient to do things this way. Creating master data models is a huge temptation amongst folks who have reductionist/mechanist perspectives (and not much enterprise-scale software deployment experience). History has shown that small, flexible standards that can be used together tend to survive longer than larger, holistic standards that cover too much. Think microformats, on standard protocols. Model the interoperability that emerges on the internet, not in large companies. Empower your users to evolve their data models and workflows without having to call coders (if that is too hard, at least make sure local, in-country developers can change and deploy the software).
  • 7. Filtering innovation out: The desire to rationalize efforts to save resources can lead to de-duplication initiatives. Reducing duplication can cut waste but can also stifle innovation by reducing the chances of discovering new ways of doing things. Many great innovations are recombinations and integrations of things that existed before. A good architecture effort should celebrate multiplicity of approaches and implementations: a better gene pool is more likely to succeed. People shouldn’t be as worried about duplication of effort as they should be about lack of interoperability between projects. That said, the number of tech efforts in the field that I’ve seen that are funded to be duplicative from day one is staggering, but only depressing when you consider how many don’t interoperate with much at all (sometimes even on purpose).
  • 8. The “open clique”: Any architecture is like a small language, and any architecture creates an asymmetry between those who know about it, understand it and are behind it, and those who don’t know about it or aren’t quite up to speed. The health and humanitarian space is small and cliques form much more easily than in the commercial space. Architecture can be used to manipulate. It is critical to keep efforts open and the consortiums diverse. An honest architectural approach would be open, and would allow critique, revision and aggregation by parties not involved in creating the original architecture documents. I like the Health Metrics Network’s approach to this. (Puppeteer's hand image credit: The Godfather logo)
  • 9. Forgetting about your users: For a project to be successful you need to understand user priorities and how they experience technology. How many technologies have been inflicted on users because they have the right technical specifications, with little regard for the user experience? How many of these technologies that users don’t like have succeeded? With mobile applications, there are many, many settings and kinds of users for technology. Making things user-friendly takes more work, especially in the field. User Experience (UX) design plays a critical role in determining how technology can help the users achieve their goals. Yet I have always been amazed how most enterprise architecture frameworks miss user experience and design (or confuse it with usability and requirements gathering). Most architecture frameworks are evolutions of mainframe-era and client/server-era learnings, generalized and repackaged for the slowly changing architectural and organizational styles used in enterprises. Consider that enterprises can afford to inflict badly designed technologies on their users much more than a ministry of health in a developing country can, so I think they are a terrible role model for this particular aspect.
  • 10. Forgetting it’s about community health: The health space is littered with technologies and standards that evolved from secondary goals of the industry that happened to be better funded for IT. For example, standards for medical record exchange that evolved out of billing reports needed for insurance, or auditing systems that track liabilities of health care organizations but not patients or doctors. Keep the end goal in mind! A good architecture effort would make sure the outcomes and impact are correctly placed. Standards would be chosen based on how well they fit a problem, and catalogued as an implementation choice.
I hope this doesn’t sound like complaining. Rather, I am proactively sharing experience for which I have first-hand scars, after having worked in the enterprise architecture space for many years. Actually, I keep coming back to the idea of drafting a book on technology patterns for developing countries to share this, but would like to make it a collaborative effort. It is simpler to point out pitfalls than to steer a course that avoids them, but that was not the point of this post. Also, any architecture is a starting point, not an endgame that does the decision-making job for you: it is a place from which to begin the conversations. Even with the best architecture efforts, the responsibility of coming up with the right solutions is with the implementers.

    The landscape is improving

    Here are some efforts I like because I think they are taking the right steps towards creating long-lasting value. If you know of other relevant initiatives please feel free to add comments below.

    Health Metrics Network (institutional/Wikipedia)

    HMN is a multilateral effort supported by funders, WHO and many organizations to define and help implement a framework for health information systems.

    OASIS: Chris Seebregts and others have been putting together an effort called OASIS to help contribute to this space. I haven’t seen much official content about OASIS yet, but knowing Chris and his deep experience in the field I know that he is likely to endorse things that really work, and has direct access to the ‘proven practices’ in his work on OpenMRS and other technology efforts in Africa.

    (This is not to be confused with the well-known OASIS consortium http://www.oasis-open.org/ which has IBM, Microsoft, Oracle and Sun as founding members)

    Recommended reading

    Monday, December 22, 2008

    Phones don’t change the world, people do

    CD4 counts, maybe someday (pic credit Dave Bullock, Wired)

    Inspired by the Wired article “Scientists Hack Cellphone to Analyze Blood, Detect Disease, Help Developing Nations” by Dave Bullock there has been a lot of activity under the change.org post “The Cellphone that could change the world” by Nathaniel Whittemore.

    Nate’s post takes a ‘remember the future’ approach where he fast-forwards to 2011, and paints a scenario where mobile technologies are widely deployed and used. I really like that approach to visualizing possibility, and wish it were used more as a social activity. Strong Angel and Superstruct do this too, in a way. The realm of the imaginable could be further expanded by more science fiction about community and civilization resilience (This year I enjoyed reading Kim Stanley Robinson’s fiction books about the onset of sudden climate change and the response of a “fictionalized NSF” and a US govt that isn’t afraid to change). But I digress. I liked Nate’s post and the ideas there. The comments were riveting.

    Katrin urged me to engage in the discussion at change.org. Reading through the original post and then through the comments (with a lot of ‘strong players’ from the mobile applications community), a couple of thoughts emerged about the state of mobile technology applications for health and other social purposes. Here are some.

    In the future…where are the business models?

    today (pic credits Eduardo Jezierski, InSTEDD)

    If you are curious, here is the reality today: In June, the week before the elections, I visited Zimbabwe. Here you can see a real, resilient, working Guava machine for CD4 counts on the outskirts of Harare. It uses microfluidic technology (for smaller blood samples and reactant costs) and, if I recall correctly, the operating principle is the same as the phone above, which is a tested technology. The thing is solid, and the staff deemed it highly reliable. Calibration was not an issue. They were able to multiply the number of CD4 lab counts manifold to 300+ per day. I was there discussing the possibility of linking it to the lab record system, but it wasn’t the highest priority.

    A lot of the discussion did center around how disruptive it would be to have an open platform (open hardware, open software, open assays, open IP on the test methods, open reactant formulas and manufacturing) for these tests.

    Just as a $99 iPhone is a red herring for the phone network costs you are going to pay every year, a cheaper test sensor that becomes widely deployed and relies on proprietary reactants has a hidden, more insidious cost.

    I did not check which assays or lab system are used by the LUCAS phone in the Wired article, and whether they are open. I was just surprised this dimension wasn’t part of the interview. I encourage Ozcan from UCLA to open-source the hardware specification to allow others to build on it!

    Question: When you plug something in do you say “I’m using electricity” or “I’m using the wall socket”? Sometimes I feel the discussion about innovation in mobile tech sounds like a discussion of innovation in energy…where the discussion centers on the design of plugs & sockets. A phone is just a conduit to a network, and a powerful, sensor-rich, user-friendly device can be underused as a collaboration tool that helps people work better together if network reliability and costs are not managed in unison.

    In my 2011, I hope that there are hybrid social-enterprise efforts that can make inroads into working with wireless providers and carriers. They need to evolve their offerings and provide the types of cost structures needed for health and social good to scale, without depending on infusions of donations to keep running or pushing costs to where they can’t be paid while willing customers can’t spend their money. Even just helping providers make money differently would help a lot. Examples: Toll-free SMS? Free-to-send? Free-to-receive? Mobile banking? Shared-costs billing? Provider-supplied location tracking of registered gov’t health staff? Anonymized tracking of random individuals for disease migration modeling? It goes on. Providers could make more money (gasp!) and they don’t.

    Beyond 2011 I hope more effort gets put into creating connectivity approaches that would be disruptive to current wireless systems. And I mean the “system” of government spectrum licensing + carriers + wireless providers + device manufacturers. But who would fund this research? Sigh…we need smaller, personal, cheaper GSM ‘towers’ that can be linked up more than phones. What would happen if every smartphone could host an 802.11 ‘peer’ network?

    Centralized or Distributed mobile apps? There are no ‘best’ practices…

    There are only proven practices, in context.

    When evaluating whether an approach fits a new situation, you have to consider the context in which other solutions succeeded or failed. I face this all the time in the discussion of ‘centralized’ versus ‘individual’ mobile solutions. Sometimes I get asked which approach is better and the answer is a) it depends b) you want both, not either/or.

    Server side approaches work well with large scale requirements

    The centralized approach uses national or international-scale gateways, like Ushahidi with Clickatell, RapidSMS, InSTEDD GeoChat with Clickatell and BT, and so on. These are appropriate for national-scale programs, where certain types of reliability, performance, security and availability are provided.

    FrontlineSMS is an example of a personal solution

    FrontlineSMS is the archetypical individual or grassroots approach, where a phone attached to a computer acts as a gateway where you control costs, numbers, location, etc., providing different types of reliability, performance, security and availability for different contexts. This type of ‘individual’ solution can even run on a smartphone, and FrontlineSMS and other projects are already proposing such a migration. For GeoChat, we put it on the backlog until we see more demand for this approach from our Asia programs.

    Approaches like RapidSMS which rely on an Asterisk server can also work on a laptop, or on a server, and can help span a ‘middle ground’ between other solutions.

    Scalability is important, but I see discussions of scalability center around numbers of messages and numbers of registered users, which is in most cases profoundly irrelevant. Again, scalability is context-specific, and measured by how well you grow with your users’ needs.

    Phone in rural Cambodia with structured data. Photo: Eduardo Jezierski (InSTEDD)

    I know a chap (I consider him a hero) who spends most of his month travelling rural Cambodia supporting a national program to send data via SMS using plugged-phone installations. Imagine it: phones with locked enclosures get forced and misused, SIM cards get swapped, chargers burn out, USB drivers fail, phones lock up… Support costs of a site are his scalability denominator. For GeoChat, for example, our main scalability metric is latency of roundtrip messages under sustained use (like twitter, responses have to come out fast) across all channels (SMS, email, twitter) with large numbers of group users and groups.

    But why one approach or the other?

    Some applications support both centralized and decentralized models (like GeoChat), but as we start working together in this budding mobile community it makes sense to pool efforts and re-use each other’s technologies. I don’t see why InSTEDD, for example, should build yet-another-phone-detection-and-driver layer if other “social good” applications have it. For example, FrontlineSMS can forward messages on to Ushahidi (acting as a local gateway). We will take a similar approach at InSTEDD, and it should be emulated by the rest of the community. By working on common protocols all our apps could forward messages to each other as required (see this example as a working draft from the Open Mobile Consortium Katrin mentioned). (And Ken, if you are reading this, contributing to FrontlineSMS source was on my last year’s resolutions, and now that we got access to the source code we can really start work on integrating/implementing it with GeoChat, Mesh4x, etc… I’m optimistic about ‘09!)

    Rough sketch of where it is all going (in terms of message exchange topology, at a very high level)

    The goal is to be able to pick the right tool for the context, and all the applications mentioned above are already working on protocols that would let you have a hybrid deployment that would allow you to scale up or out as needed. As contexts change, having freedom to evolve your app and not be locked into one or another is key.

    Once you are moving messages around, how do you make sure different applications interpret the information in similar ways?

    Shared formats for data exchange

    To achieve interoperability, and reuse the human capital of having trained users, mobile apps should also share conventions on what gets put IN the messages. There is a huge gap in defining what gets put on SMS messages for diverse uses:

    • Free text, with specified language
    • Free text with explicit tags
    • Locations (lat/long, place names, village PCodes etc)
    • Delimited data (e.g. Ed, Jezierski, Cambodia)
    • Self Describing Data (e.g. firstn=Ed|lastn=Jez|city=Seattle)
    • Multi-Message batching, sequenced or order-agnostic
    • Message batch retries
    • Compression
    • and the list goes on…
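
    As a rough illustration of two of the conventions listed above, here is a minimal Python sketch that parses a delimited message and a self-describing one. The function names, the `|` and `=` separators, and the field names passed in are assumptions for the example, not an agreed format from any of the projects mentioned.

```python
def parse_self_describing(text, pair_sep="|", kv_sep="="):
    """Parse 'key=value|key=value' into a dict; reject malformed pairs."""
    record = {}
    for pair in text.strip().split(pair_sep):
        if not pair:
            continue  # tolerate a trailing separator
        key, sep, value = pair.partition(kv_sep)
        if not sep or not key.strip():
            raise ValueError(f"malformed pair: {pair!r}")
        record[key.strip()] = value.strip()
    return record


def parse_delimited(text, fields, sep=","):
    """Parse positional data; the receiver must already know the field order."""
    values = [v.strip() for v in text.split(sep)]
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} values, got {len(values)}")
    return dict(zip(fields, values))
```

    The tradeoff shows up directly: the delimited form is shorter on the wire but only works if sender and receiver share the field order out of band, while the self-describing form spends characters on keys so that a different backend can read it without prior agreement.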

    The community of builders of mobile apps for social purposes has to start catching up in this space. I suggest re-using the leadership of twitter and other services in evolving some conventions (e.g. @user, #tag) in common ways where applicable. I would also like industry players such as Nokia (with its data gathering solutions) and Google to participate in the open forum, too.
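
    For instance, a receiving application could pick up the @user and #tag conventions with a few lines of Python. This is only a sketch: the `parse_conventions` name and the choice to treat whatever remains as the message body are assumptions for illustration.

```python
import re

# '@' or '#' only counts when it starts a word, so 'a@b.org' is left alone.
MENTION = re.compile(r"(?<!\w)@(\w+)")
HASHTAG = re.compile(r"(?<!\w)#(\w+)")


def parse_conventions(text):
    """Split a free-text message into mentions, tags and the remaining body."""
    return {
        "mentions": MENTION.findall(text),
        "tags": HASHTAG.findall(text),
        "body": " ".join(w for w in text.split() if w[0] not in "@#"),
    }
```

    A message such as "@clinic12 3 cases #fever near market" stays readable to a person while still giving software a recipient and a category to route on.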

    For example, in the Cambodian Avian Influenza hotline pilot we implement batching and self-describing data over SMS. We should get together with RapidSMS and define a common approach. This would let the Cambodian government switch out InSTEDD’s backend and put in RapidSMS transparently, if they chose to do so.
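
    To show what order-agnostic batching with retries can look like, here is a minimal Python sketch. It is not the format used in the Cambodian pilot: the `<batch_id>:<index>/<total>:` header, the 160-character limit and the 99-part cap are all assumptions made for the example.

```python
import math


def split_batch(batch_id, payload, max_len=160):
    """Split a payload into messages with a '<batch_id>:<index>/<total>:' header."""
    # Reserve room for the longest header this sketch allows.
    chunk_len = max_len - len(f"{batch_id}:99/99:")
    if chunk_len <= 0:
        raise ValueError("batch_id too long for max_len")
    total = max(1, math.ceil(len(payload) / chunk_len))
    if total > 99:
        raise ValueError("payload too long for a 99-part batch")
    return [
        f"{batch_id}:{i + 1}/{total}:{payload[i * chunk_len:(i + 1) * chunk_len]}"
        for i in range(total)
    ]


class BatchAssembler:
    """Reassembles batches whose parts may arrive in any order, or twice."""

    def __init__(self):
        self._parts = {}  # batch_id -> {index: chunk}

    def receive(self, message):
        """Return the full payload once every part is in, else None."""
        batch_id, position, chunk = message.split(":", 2)
        index, total = (int(n) for n in position.split("/"))
        parts = self._parts.setdefault(batch_id, {})
        parts[index] = chunk  # a retried duplicate simply overwrites itself
        if len(parts) < total:
            return None
        del self._parts[batch_id]
        return "".join(parts[i] for i in range(1, total + 1))
```

    If two backends agreed on a header like this, either one could reassemble batches sent by the other's clients, which is the kind of switch-out described above.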

    One example of this is GeoChat + JavaROSA. We want to support JavaROSA front ends to send structured data to GeoChat, and if we documented the format well, other clients (like Nokia’s?) or servers could be used interchangeably.

    JavaROSA is an excellent open source project, great technology and well run. We have contributed the ability to do 2-way sync between phones and between a phone and a server, already.

    Even with these agreements interoperability can also lead to a shallow openness, where applications work with others… as long as they can continue to hoard the data and lock-in users. You can see this happening over the last year in the space of social networking technologies, where many announcements of open approaches veil an underlying strategy of trying to become the ‘hub’ or the ‘one stop shop’.

    Do the benefited populations really gain much if folks can collect more data, but they can’t move it around?

    Sharing Data

    We all know the limits to sharing data are political or incentive-based, more than technical. But technology makes a fine excuse for not sharing information.

    In the field one faces many silos – NGOs with different mandates, Government agencies with different domains (animal health, human health), research programs funded by different ivy league universities, not to mention ethnic, language and country borders.

    This is an area where InSTEDD has been doing a lot of work as part of the Mesh4x project, which basically allows data to be shared two-ways between disparate systems.

    Here are some latest updates

    Leslie (Les) Lenert, Director, National Center for Public Health Informatics, US CDC, puts forward technologies they believe are disruptive. Better devices, data collection, and data sharing:

    CDC slide by Les Lenert, photo credit Taha Kass Hout (InSTEDD)

    The goal: An Open & Sustainable Platform for the end users

    Ken uses the \o/ logo for FrontlineSMS, a gesture of empowerment. I smile every time I see it.

    We can’t forget that all these technology efforts are trying to empower individuals and organizations, and simplify the work of caring for one’s own community or for others.

    All the teams mentioned here are working together already in different capacities towards this end goal. Resources, timelines, tools are always an issue, but over time things will be more integrated.

    All the technologies mentioned here are converging towards a shared architecture: a platform for data exchange and collaboration built around mobile users in the harshest environments. A platform that can start small and grow transparently, or start large and continue running even if the centralized networks are unavailable. Because of this shared architecture, the end portfolio will be stronger, dollars spent on technology will go further, and users will have a simpler entry point to learn which tools are right for their context.

    So when a new phone comes out with a CD4 blood cell sensor, its users will know that it can send its data and “it just works”...and then go change the world one CD4 test at a time!

    Thursday, December 11, 2008

    For Geeks: Progress on Mesh4x: Cloud Services, Architecture, Adapters, and Adopters

    As the year wraps to an end we have a mixed blessing: On one hand we have a small but growing portfolio of technology stemming from our organization's immediate goals to improve disease detection and public health in South East Asia, being built at a steady pace by our small but ultra-capable team. On the other hand, the scenarios we are addressing are proving to be relevant in all walks of life of the health and humanitarian space, generating an increasing demand and with it, a simultaneous increase in breadth and depth on the demand side. Exciting times indeed!

    Of our main technology efforts (Riff, GeoChat, Mesh4x, TrackerNews.net) Mesh4x (http://www.mesh4x.org) is the one that started getting the earliest deployments to the real world.

    From mesh4x.org:

    “The goal of mesh4x is to provide a portfolio of libraries, tools and applications that simplify using standards-based data meshes from multiple platforms and languages…”

    The libraries can be used right away by developers who integrate them in their own applications, so there was no need for them to wait for a more packaged set of user interfaces and end to end experiences.

    Why it matters and why InSTEDD is working on this

    Data meshes have appealing characteristics for our users, so our contributions to the Mesh4x project are driven by observed data-sharing needs in the health and humanitarian space.

    • Symmetrical: They allow data to exist in a concurrent multi-master environment where updates can be applied at any node in the mesh.
    • Asynchronous: They allow offline updates to information and synchronization with other nodes without requiring data locks, essential for occasionally connected applications.
    • Dynamic: The synchronization can happen even in constantly changing connectivity topologies. I can sync with a server and later the sync can be done between my client and another client, who could then sync with another server if the first one isn't there, and so on.

    This matters to us as these characteristics help information flow and data sharing even in the tough contexts we face:

    • Symmetrical: No organization or application has, de-facto, greater control over information than any other. Symmetry allows power to be shared equally amongst partners, in a true multi-master way, resulting in less hoarding of live data.
    • Asynchronous: Connectivity is an occasional luxury, and the most up to date information is found where it is less likely to have a connection. Storing changes locally and sharing them opportunistically keeps information moving.
    • Dynamic: Connections are opportunistic – you may not have Internet access at all, but you have access to local wifi networks, physical contact with other devices, etc. Data will eventually get to the desired endpoints as it leaps opportunistically between participants.
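    To make the multi-master idea concrete, here is a minimal sketch of how two nodes could merge their copies without either one being "the server". It is loosely inspired by FeedSync-style versioning but is not the mesh4x implementation: the `Versioned` class, its fields, and the tie-breaking order (edit count, then timestamp, then node name) are assumptions chosen for illustration, and a real mesh would also keep history and flag conflicts for review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    """One item plus the sync metadata each node keeps for it (hypothetical shape)."""
    item_id: str
    payload: dict
    updates: int  # how many times the item has been edited
    when: str     # ISO-8601 UTC timestamp of the last edit
    by: str       # name of the node that made the last edit


def pick_winner(a, b):
    """Deterministically choose between two versions of the same item."""
    return max(a, b, key=lambda v: (v.updates, v.when, v.by))


def merge(local, remote):
    """Merge two replicas (dicts keyed by item_id); neither side is the master."""
    merged = dict(local)
    for item_id, theirs in remote.items():
        mine = merged.get(item_id)
        merged[item_id] = theirs if mine is None else pick_winner(mine, theirs)
    return merged


# Illustrative data: a clinic PC and a phone that edited the same patient offline.
clinic = {"p1": Versioned("p1", {"name": "Ed"}, 1, "2008-12-01T08:00:00Z", "clinic")}
phone = {
    "p1": Versioned("p1", {"name": "Ed", "village": "X"}, 2, "2008-12-02T09:30:00Z", "phone"),
    "p2": Versioned("p2", {"name": "Sok"}, 1, "2008-12-02T10:00:00Z", "phone"),
}
# Whichever node starts the sync, both end up with the same data.
same_result = merge(clinic, phone) == merge(phone, clinic)
```

    Because the winner is picked by the same rule everywhere, the order and direction of syncs does not matter, which is what lets data hop between whatever devices happen to be in reach.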

    Some concrete applications of mesh4x in the space:

    Mesh4x goes mobile with JavaROSA, allows you to sync data on your handset with no Internet

    Mesh4x SMS Adapter: Sync data without an Internet connection

    I have another blog post I should release soon that highlights the proven value of meshes and Groove in the humanitarian space, and my personal introduction to the uses of this architectural pattern.

    But this post is about the progress & directions for the project.

    Cloud-Based Service

    In the last post we mentioned building a cloud-based service as a contribution to the space. The demand was for an always-online, cheap-to-host, simple server that could act as a data store and as a relay point for devices connected to the Internet.

    The implementation was embarrassingly simple on Amazon's Elastic Compute Cloud (EC2, a dynamic and virtualized hosting environment) and S3. As a matter of fact, a single Java servlet running on Tomcat + Linux and driving the Java Mesh4x sync libraries ("Mesh4j") provides the heart of the logic. Less code is the best code!

    We are doing a pilot with the Centers for Disease Control (CDC), synchronizing their Microsoft Access-based EpiInfo application, and they asked if the health surveys they were taking could be automatically geo-mapped as the users synchronized to share their information. This led us to incorporate an ontology ("schema") mapping aspect to tell the server "expose a KML feed taking THIS as the title, description, address, and timestamp for the items".

    Taha describes the work with CDC on his Biosurveillance 2.0 blog and why using mesh4x will help them extend the effectiveness of EpiInfo for outbreak investigation.

    We will be opening this service up progressively as we test it out with initial users and tweak it based on their feedback; I hope in a couple of months to have a tested version we can point you to publicly! In the meantime, contact us if you are interested via email or if you are a developer via the Mesh4x.org code project.

    Part of the forcing function for writing this post this week is that we've been chatting with CDC, JavaROSA, and others about these store/endpoint/mapping capabilities and I'd rather we start the collaboration early before we accidentally diverge codebases or approaches.

    Under the Hood

    This is the architecture that the server has been going towards these last couple of weeks:

    Aaagh, lots of coloured boxes! A drilldown of what the server architecture is trending towards.

    Update APIs:

    These allow other applications to change the data in the service. A mesh endpoint allows FeedSync-style updates, but we'll add AtomPub for simpler edits via HTTP POST and other RESTful verbs that are easy to manage from JavaScript or are useful if you don't need the full power of the mesh. A JavaROSA endpoint will allow the right metadata to be exposed to JavaROSA or AndroidROSA handsets, and accept updates.
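    For a feel of what a "simple edit via POST" could look like from a client, here is a rough sketch that builds an Atom entry and wraps it in an HTTP POST request. The feed URL, the choice to carry the record as JSON text inside the entry content, and the helper names are all assumptions for illustration, since the AtomPub endpoint is not built yet. A complete Atom entry would also carry an id and an updated timestamp.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_entry(title, record):
    """Serialize one record as an Atom entry (content format is an assumption)."""
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content", {"type": "text"})
    content.text = json.dumps(record, sort_keys=True)
    return ET.tostring(entry, encoding="utf-8")


def build_post(feed_url, entry_bytes):
    """Prepare (but do not send) the POST that would add the entry to a feed."""
    return urllib.request.Request(
        feed_url,
        data=entry_bytes,
        headers={"Content-Type": "application/atom+xml;type=entry"},
        method="POST",
    )


# Hypothetical feed URL; passing the request to urllib.request.urlopen would send it.
request = build_post(
    "http://example.org/feeds/patients",
    build_atom_entry("Patient 1", {"name": "Ed", "country": "Cambodia"}),
)
```

    The point is that a client that only needs to add or edit a record never has to understand the sync protocol; it just posts an entry.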

    GeoChat and FrontlineSMS bridges would allow message forwarding and sending semistructured data in directly via SMS.

    Storage:

    This is the storage layer for all the data and the configuration, security information, etc needed to keep the service running. In our web-based instance, all this data is stored in S3, but if you wanted to host this in your own office or in a clinic, it would all be sitting inside a MySQL instance. As a matter of fact, all the mesh4x services' information is managed by mesh4x itself, so the actual configuration data is stored via an adapter.
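    The "stored via an adapter" idea can be sketched as one small interface with interchangeable back ends. This is only an illustration under assumptions: the class and method names are made up, an in-memory dict stands in for a hosted store such as S3, and SQLite (from the Python standard library) stands in for a local MySQL instance.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class StorageAdapter(ABC):
    """What the service needs from any storage back end (hypothetical interface)."""

    @abstractmethod
    def put(self, key, value):
        ...

    @abstractmethod
    def get(self, key):
        ...


class MemoryAdapter(StorageAdapter):
    """Stand-in for a hosted key/value store."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key):
        return self._items.get(key)


class SqliteAdapter(StorageAdapter):
    """Stand-in for a relational database running in your own office or clinic."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS items (key TEXT PRIMARY KEY, value TEXT)"
        )

    def put(self, key, value):
        self._db.execute(
            "INSERT OR REPLACE INTO items (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self._db.commit()

    def get(self, key):
        row = self._db.execute(
            "SELECT value FROM items WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else json.loads(row[0])


def save_feed_config(store, feed_name, config):
    """Service code talks only to the interface, never to a concrete back end."""
    store.put(f"config/{feed_name}", config)
```

    Swapping the hosted instance for an on-site one then becomes a matter of handing the service a different adapter, and configuration travels through the same path as the data.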

    Ontology Extraction:

    Our service differs from a database in that you don't need to tell it the schema of your information up front. As a matter of fact, we would like to know as little as possible about the format of your data. We prefer to let applications change and evolve the data they use without having to ask developers to change database structures or write specific code for each case. But knowing just a little about the structure of your data helps with things such as defining mappings and filters, so we try to infer as much as we can. The Ontology Extraction component allows you to submit RDF-formed information (or XForms-based, or any other format that has a transformer) and we keep track of (for example) what fields make up your entities. If you supply such ontologies yourself (in RDFS, or an XForm definition) we keep them around, too (e.g. 'Patient Date of Birth is a Date/Time field').

    This thingie is supposed to represent an RDF triplet.

    Internally, we are using RDF as the default standard to represent data and ontologies. RDF has many properties that make it the simplest appropriate choice, but that would be the topic of a whole different post in and of itself.
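    As a toy illustration of extraction, the sketch below flattens records into subject/predicate/object triples and then infers which fields exist and what kind of values they hold. It uses plain Python tuples rather than a real RDF library, and the function names and type labels are assumptions, not the actual component.

```python
from collections import defaultdict
from datetime import datetime


def to_triples(subject, record):
    """Flatten one record into (subject, predicate, object) triples."""
    return [(subject, field, value) for field, value in record.items()]


def guess_type(value):
    """Very rough type inference for a single value."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        try:
            datetime.fromisoformat(value)
            return "date/time"
        except ValueError:
            return "text"
    return "unknown"


def extract_ontology(triples):
    """Collect, per field, the value types that have been seen so far."""
    seen = defaultdict(set)
    for _subject, predicate, obj in triples:
        seen[predicate].add(guess_type(obj))
    return {field: sorted(types) for field, types in seen.items()}


# Illustrative records; nobody declared a schema up front.
triples = to_triples("patient/1", {"name": "Ed", "date_of_birth": "1980-05-17"})
triples += to_triples("patient/2", {"name": "Sok", "visits": 3})
ontology = extract_ontology(triples)
```

    Records with different fields can sit side by side, and the inferred field list simply grows as new data arrives.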

    Ontology Mapping:

    Ontology Mapping allows us to map fields and entities of different ontologies to help us make sense of your data. For example, to draw a nice map of your data we need a title and a descriptive summary, a position, and a timestamp associated with the entity. Which field should provide the timestamp? Which address or coordinate fields should be used to put an item on the map? How should the description be composed from the data? Mappers allow us to do this, and in the future you will be able to define these yourself through the user interface.
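    A mapper can be as small as a set of templates that say how each "map" field is composed from the source fields. The sketch below assumes a survey record with made-up field names and uses Python format strings as the mapping language; the real mapping format may look quite different.

```python
def map_record(record, mapping):
    """Build the target fields by filling each template from the source record."""
    return {
        target: template.format_map(record)
        for target, template in mapping.items()
    }


# Hypothetical survey record and mapping, for illustration only.
survey = {
    "patient_name": "Ed",
    "symptoms": "fever",
    "village": "Village X",
    "province": "Mukdahan",
    "visit_date": "2008-12-02T09:30:00Z",
}
map_fields = {
    "title": "{patient_name}",
    "description": "{symptoms} reported in {village}",
    "address": "{village}, {province}",
    "timestamp": "{visit_date}",
}
mapped = map_record(survey, map_fields)
```

    Keeping the mapping as data rather than code is what would let end users define it through a user interface later on.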

    Filtering:

    Filtering is essential in a mesh where little devices and big devices coexist. You could have refugee records for a whole country in one mesh4x mesh, but on a mobile phone you'd probably only want to keep a subset of that. As soon as we expose filters it will be easy for a phone to say 'I work with patients in village X' and just sync that subset of data.
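    In its simplest form, a filter is just a set of field values a device cares about. The sketch below is an assumption about how that could look; the filter syntax for mesh4x has not been defined yet, and the function and field names are illustrative.

```python
def matches(record, criteria):
    """True when the record has every field value the device asked for."""
    return all(record.get(field) == wanted for field, wanted in criteria.items())


def filter_for_device(records, criteria):
    """Select the subset of a large mesh that a small device should sync."""
    return [record for record in records if matches(record, criteria)]


# Illustrative data: the phone only works with patients in village X.
all_patients = [
    {"id": "p1", "village": "X"},
    {"id": "p2", "village": "Y"},
    {"id": "p3", "village": "X"},
]
phone_subset = filter_for_device(all_patients, {"village": "X"})
```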

    Format Transformers:

    Format Transformers are components built to translate data into specific formats. GeoRSS and KML are standard formats for representing information with geographic aspects to them. You can see the KML in Google Earth, for example, and items would appear on the map as people sync their data to the server.
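    Here is a small sketch of what a KML transformer boils down to: take items that already have a title, description, timestamp and coordinates, and write one Placemark per item. The input field names and the sample coordinates are assumptions for illustration; only the KML element names and namespace come from the KML standard.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def to_kml(items):
    """Render mapped items as a KML document with one Placemark each."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for item in items:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = item["title"]
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = item["description"]
        stamp = ET.SubElement(placemark, f"{{{KML_NS}}}TimeStamp")
        ET.SubElement(stamp, f"{{{KML_NS}}}when").text = item["timestamp"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{item['lon']},{item['lat']}"
        )
    return ET.tostring(kml, encoding="unicode")


# Illustrative item with approximate coordinates.
kml_text = to_kml([
    {
        "title": "Ed",
        "description": "fever reported in Village X",
        "timestamp": "2008-12-02T09:30:00Z",
        "lat": 16.54,
        "lon": 104.72,
    }
])
```

    Saving the returned text as a .kml file is enough for a viewer such as Google Earth to show the items.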

    Transformers for XForms models and XForms forms allow us to translate the information of your entities and their ontologies into XForms formats. We see the utility and the pragmatism of XForms models as a way of exchanging records and of defining the UI model of the forms users see in XForms, so these transformers allow us to go from our internal RDF-centric representations to these broadly adopted formats.

    Sync Adapters:

    Finally, you have all this data here, but you probably want to work with it elsewhere! Folks have suggested/requested the following as potential endpoints for the data:

    • Google Spreadsheets: we have a Microsoft Excel adapter, so why not a Google spreadsheet one? Imagine creating a form, having it fill out a spreadsheet with gadgets for analytics, and then. Google spreadsheets are also great when lots of people online have to work live on the same data.
    • Zoho is coming up with lots of useful applications. Imagine synchronizing your Zoho app with a table in your MySQL or MS-Access database.
    • MySQL: a lot of websites out there -for good or for bad- run with their MySQL instance exposed on an open network port. Someone we were working with in Mukdahan, Thailand (a 12-hour truck ride from Bangkok), asked the simple question: if I give you my connection string, can you just put the data there for me? Seemed simple and straightforward, so we will line it up in front of other needs!

    Together with running sync adapters we will need some user interface to schedule these updates, define mappings between schemas/ontologies, and resolve conflicts. A nice UI for this may end up taking a big part of the project effort, so if you can refer us to open source projects that do any of this or want to contribute, don't be shy!

    These mappings are part of the mesh too, so in the future (assuming anyone requests it from InSTEDD or contributes the source) you could be offline and mark an Excel spreadsheet as 'shared', and when you sync, not only would the data travel back and forth, but the server itself could create a Google spreadsheet endpoint (or something similar) with the same information for others in your team to use!

    Putting it all together

    In my next post I will explain how all the pieces of the Mesh4x project come together to help integrate data from disparate systems and connect these applications into a synthetic whole, instead of having dozens of islands of information.

    More information

    http://www.cdc.gov/epiinfo/ EpiInfo is CDC's outbreak investigation surveying tool. You can participate in their Open Source project on CodePlex: http://www.codeplex.com/EpiInfo. We are working with them to enable synchronization over the cloud of their MySQL/Access based tool.

    ....And recently had a release, announced hours ago. Congratulations to the CDC team!

    Friday, October 10, 2008

    Mesh4x goes mobile with JavaROSA, allows you to sync data on your handset with no Internet

    The latest batch of advances in mesh4x and JavaROSA allows you to do forms-based data input and editing on any java-enabled phone and synchronize with other phones or a server.

    JavaROSA forms UI, in action. I'm in ur phonez, filling formz

    You can define a generic form, load the form definition to any Java-enabled cell phone loaded with the JavaROSA forms client extended with a mesh4x transport component, do data entry on your phone and synchronize the data 2-way with a server or directly peer-to-peer with another phone handset.

    All types of sync options

    The form definitions are saved and exchanged as XForms, and the data as XForm models. The data can be exchanged over HTTP (if the phone users can afford GPRS and have a data connection) or over compressed SMS messages. This can even happen between phones directly: you enter the phone number of another handset running the app and press "sync". Tondat describes this in detail in his latest blog post. The clients depicted here look awful on the emulators as they use J2ME Polish (http://www.j2mepolish.org), which then makes the app look great on specific handset models and adapts the UI to the capabilities of each phone.

    This extends the scenarios of JavaROSA beyond data entry, bringing it closer to a collaboration tool where the information being entered can be edited by multiple users and shared from a central database back to the phone in the field.

    This contrasts with the "data collection" pattern of data entry solutions...if you believe information is power, data collection creates a vast vacuum cleaner shifting the balance of power: away from those in the field who understand the data the best and can act on it the soonest, towards the center. But does it need to be this way?

    At InSTEDD we look at information flows such as those and ask ourselves: What information should be flowing back to the field? How can the person at A work better with the person at B beyond just sending data? How do we shift from 'sending' data to sharing realtime and enriched information two ways? The mesh4x + JavaROSA effort addresses some of these questions.

    People working in health in developing countries spend a large proportion of their time filling in forms that then go somewhere. What information would they want in return?

    This was made possible through a collaboration and set of code contributions we had with the JavaROSA team. JavaROSA is an implementation of OpenROSA which could become a strong player in the mobile data gathering and sharing space in the near future. Kudos to Clayton Sims, Jonathan Jackson, Andreas Kollegger, and everyone else from the javaROSA team for your work and friendly attitude!

    The XForm definitions are stored in an HTTP service behind a REST API at http://sync.instedd.org/, which is a strawman of a mesh4x cloud-based service. If you played with our map-sync technology you have already used this service.

    kind of like this

    Next Steps

    Our strategy with mesh4x is to contribute code to existing projects being deployed in the field that need 2-way synchronization, data exchange over SMS, or multi-master storage based on standards. Episurveyor, Gather, Pendragon etc come to mind.

    Within this directive, our roadmap on mesh4x will involve effort in four areas:

    • Cloud Services
    • Data Standards
    • Client Applications
    • Transformers and Adapters

    1. Cloud Services: A scalable server implementation supporting security standards. We have a skeletal solution built in C# that we grew from the 'sse' open source project in Codeplex (which has moved to http://mesh4x.org as well). We host an instance at http://sync.instedd.org/. But it uses a relational database, so we would have to change the storage layer if we wanted to grow it for real. So what are our options? Java on EC2/S3 seems to be the shortest path given the code we already have in the project, but Python on Google App Engine sounds enticingly simple to maintain and scale, at the expense of initial effort to port the sync libraries to Python. That seems unnecessary until you consider that Inveneo, the African Access Point and other platforms prefer Python or Ruby for a bunch of good reasons. We'd like your input: .NET+MySQL, Java, or Python + GAE?

    Amazon's web services include VM Hosting (EC2), Storage (S3) and Message Queuing (SQS)

    Google App Engine - coolest logo in town!

    2. Default Data Exchange Standards: Using XForms is simple and works for easy scenarios. We'll advocate use of RDF where XForms falls short, but by all means we want to avoid a custom/ad hoc way of defining a typed dictionary 'schema', versioning of that schema, and of encoding entities following the schema.

    This fun little thing looks like an RDF triplet, of sorts.

    Following a clear set of standards for data formats will allow easier mapping of information from one system to another, and the creation of tools that allow end-users to define how their systems integrate.

    Isn't this obvious? Defining which standards to support early in the process is critical because it is easy to reinvent the wheel in this space. Even accidentally. Anyone who can code their way out of a paper bag can define a custom way of serializing dictionaries (a collection of names and values such as name:Ed, country:Cambodia) and define a schema model for it in an hour or so. But inventing one just tends to lead to incompatibilities in the long run, and lack of interoperability in humanitarian systems is an obstacle that anyone with experience has seen get in the way of collaboration and data sharing. It is much smarter to support a well documented subset of a standard such as RDF or XForms, and define extensions as needed (both standards allow schemas/ontologies to be extended). If we can pledge to play along, applications from multiple organizations will add up to be 'more than the sum of the parts'.

    3. Client Applications: We desperately would like to implement or contribute to a stand-alone fat (aka rich) client that you can use on your desktop to synchronize two data endpoints. Ideally this client would allow you to set the endpoints for synchronization, mapping of data schemas, filters, and managing conflicts - in a secure and easy-to-use interface. Any pointers?

    4. Transformers and Adapters: There are many existing applications out there that do their work very well. Sometimes two applications serve similar purposes for different audiences or contexts. Sometimes new applications have to coexist with politically entrenched older systems. While we are building common-purpose adapters to mesh4x (such as Hibernate, Java RMS, and KML, which we already have, but also CSV and Google Spreadsheets, for example), we already hear demand for specific adapters that take into account particular needs of real-world applications that are already deployed in the field. Which systems should we start with? We have been approached with questions about mesh4x and OpenMRS (http://openmrs.org/) or Sahana (http://www.sahana.lk/).

    We'd love your input!

    Tuesday, September 30, 2008

    Phnom Penh Innovation Lab team taking its first steps!

    After months of work in the region, our technology team in Cambodia has started their daily work! We had our first standup meetings last week!

    As part of InSTEDD's strategy of 'sustainable innovation' we are creating a full engineering team that over time owns and reinvents technologies used in the region. All technologies go obsolete - so for true sustainability you need to assemble a team of people that will invent the 'next thing' - and give it the skills, capital and opportunities to do so.

    It's one of those rare, beautiful moments in the professional life of anyone: seeing a team's first day, the getting to know each other, starting to create a work culture, picking a set of small challenges and taking them on. Some moments stick - first standup, seeing the first code checkin notification, hearing the first idea that is "so obvious and locally appropriate yet no one in the global team had thought about it".

    It starts with the people, so here they are:

    Day one. Each t-shirt has a long story behind it too. Chris di Bona gave me the google t-shirt not long ago. Channe is wearing a Microsoft Developer Division all-hands tshirt 'Your Software, our Passion'. Saravann is wearing a Clarius 'automate your work' t-shirt from my Software Factory days, and Tola has an Oredev/Expertzone t-shirt from a conference I spoke at once in Europe.

    From left to right we have Sopheap, Channe, (myself behind), Sodany, Laura, Saravann (below) Miguel and Tola.

    Mann, Channe and Sopheap. Ot-Painha-haa!

    Here are Mann, Channe and Sopheap

    Channe Suy


    I am interested in working with object oriented technology such as Java and C#. Besides work, I like traveling to the mountain area or to the beach. As I start working with InSTEDD, I would like to learn more about good patterns and practices, improve communications with users, and project management.


    Sopheap


    I'm Sopheap, working as a software developer. I am a third-year student at the Royal University of Phnom Penh (Computer Science). I am interested in .Net and Java, and I spend my day and time on both technologies. I always do research in the library or by reading e-books, and sometimes take a course related to the topic. I spent some time learning the Google technologies too. As a member of the InSTEDD Team I want to improve my ability with the .Net (C#) and Java languages. Outside of work I like to spend time reading and researching, and sometimes I spend time with friends at the coffee or at the countryside with the fresh and green views.


    Mann (Lim Chanmann)


    My name is Lim Chanmann, but just call me Mann. I have been working on web-based application development with ASP and ASP.NET with C#. And now here at InSTEDD I am interested in OOP, OOD, software design patterns and best practices, and project management as well as the latest technologies. I spend my free time swimming, and chatting with friends or sometimes with a stranger so I can learn something new.

    The QA team (and Daniel behind)

    The QA team and Dany in action - behind is Daniel from http://www.ideapreneur.net/, who slept over at the Lab around Barcamp Phnom Penh (more about that soon).


    Miguel Collantes (QA Manager)


    I have 10 years experience in Software Quality Assurance (SQA) during which I developed management tasks and management of offshore teams. I'm working as QA Manager for InSTEDD. My challenge is to accomplish the expected goals while building and working together with foreign teams from very different cultures.

    Laura Fricke Weinberg


    I have 5 years experience in Software Quality Assurance (SQA) during which I developed testing tasks such as data set generation, test cases and test plans creation and QA team leadership. I am currently working with InSTEDD as QA Engineer and have great expectations in improving my scope of knowledge in QA for other InSTEDD software and technologies, as well as teaching and learning every day in our Cambodian Innovation Lab.


    Saravann Paol


    Hi! I am Saravann. I worked as a Computer Operator for Digital Divide Data Organization, and I got started with InSTEDD on Friday 29th August 2008. It is the first job where I have the opportunity to work with a large team. I'm a third-year student at International Institute of Cambodia (Computer Science). I am interested in my position because it can help me to learn new technologies and know more about the diseases and disasters happening in the world. These are the big problems that all the people in the world have to know and learn. I was really happy on my first day, studying with Miguel and Laura who are very good teachers, friendly and good communicators.
    I hope that when I finish my studies I will have new knowledge and skills to improve my personal needs and my work.


    Ung Tola


    My name is Ung Tola. I worked at Digital Divide Data Organization for over two years. Now I got started with InSTEDD on September 1st 2008. I am a third-year student at Norton University and my major subject is English for Teaching. I'm also interested in  the Internet and new technologies being used in the world. I am happy to spend my time with InSTEDD learning new software skills and preparing products that will be used for people helping with diseases and disasters.
    In my free time I like having small picnics in the country side.


    Sodany Chap

    I am from Kampong Cham Province and graduated from the Royal University of Law and Economics in Law field. Then I have been pursuing my study in Masters of Management. I am very excited to have such a great opportunity to access to higher education in this competitive world, and I do wish many Cambodians had at least the same opportunity like me as education plays very important role in the social development of a country like mine.
    So far, I worked and gained some experience from a few NGOs here such as Legal Aid of Cambodia (LAC), Cooperation International (CI), Cambodian Defender Project (CDP) on women trafficking, PILLAP and also from Cambodian Arts and Scholarship Foundation (CASF). 
    I have been working for InSTEDD since mid-June. I am extremely interested in what InSTEDD does. I am now helping with some translation and also learning to be a tester with a group of talented people from DDD and with Miguel and Laura. I have a very strong commitment to effectively and efficiently work on InSTEDD projects to ensure their smooth operation.
    I'd like to deeply thank Marry Jane, Dennis, Eduardo, Miguel and Laura and all the other InSTEDD staff who always support and encourage me in implementing my work. Also, let me wish all of them the best health, luck and success in both work and personal life.

    InSTEDD has been able to attract these excellent fellows through - and with the help of- some of our partners in the region, especially Digital Divide Data and Yejj.

    The goal of the QA group is to 'train the trainer' and seed a full QA unit that can carry this aspect of the software development lifecycle end to end. Cambodia has very little experience in QA and we hope to share a bit of our experience in what's needed to have robust systems deployed reliably to your users.

    Kzu explains KML and Linq at the Cambodian Ministry of Health

    The development team and our Product Manager have engaged with the Cambodian Center for Disease Control and now have a prototype of a mobile application used for hotline call tracking. It submits the information via batches of SMS messages to a desktop with a phone plugged in, exports data to Excel and posts it to an online Riff instance where the calls can be classified and collaborated on. All this is open source and was done in an agile fashion with weekly iterations, and they recently refactored the code to design patterns such as MVP.

    Nico di Tada does a hands-on TDD workshop in the InSTEDD innovation lab. Red-Green-Refactor

    The skills gained and experience with concrete technologies (SMS-based applications, RESTful web services) will be useful beyond this particular system. Plus, Daniel Cazzulino and Nico di Tada have been giving workshops here in Phnom Penh covering topics such as REST architectures, TDD, KML and Linq.

    Training, roundtables and architecture & methodology discussions are a key part of life at the InSTEDD lab. We don't have enough furniture yet to accommodate a lot of visitors but as soon as we figure out these logistics issues we'll be posting the schedule online and take an 'open house approach'- if you show up, you can participate!

    PS This is just the initial team - we are still hiring for QA Engineers, Graphic Designers, Software Developers, Test Leads, Test Manager and ICT leads here in Phnom Penh. Contact me if it sounds interesting!

    Tuesday, July 08, 2008

    InSTEDD Presentation at HISA

    Here is the presentation we gave at HISA.

    • Brief intro about InSTEDD,
    • An overview of information flow challenges in health we found in Cambodia which we hear are also present in other contexts,
    • How collaboration can help with those challenges, and concretely, which technologies InSTEDD is focusing on to help with that collaboration and information flow,
    • A quick overview of method: Agile practices, trying to be a good OSS neighbor, and the innovation lab we are building in Cambodia to bring the field needs and local creativity into the very first steps of future tech development.

    I believe any sustainability planning is at its core an exercise in business modeling. At InSTEDD we think one way we could attain this elusive sustainability is to shift focus from having beneficiaries sustaining external efforts, into creating an environment with the capacity to generate and grow new innovations. It's harder, and there's no silver bullet, but still worth learning to do right.

    PS: This first slide always gets folks' attention, by design...

    I think some slides had issues converting, if you run into trouble please let me know and I'll fix it.

    From the slides you may wonder what is the status of the tech we have been working on?

    • Mesh4x has been extensively blogged about, with its recent addition of an adapter that lets you sync via SMS messages.
    • GeoChat (Overview, Details and source): we've demoed chunks of it, but after Myanmar and the Golden Shadow exercise we knew we had to go back to the drawing board with the UI and some aspects of the infrastructure. We'll be blogging about this soon, once the UI again allows the end-to-end scenarios folks expect.
    • Riff allows you to create public or private groups for collaboration around information streams by adding metadata to items, analytics and visualization capabilities. Much blogging needs to happen about this project. We have two interns from Trinity College working on the machine learning aspects of the project under the guidance of Taha Kass-Hout and Nicolas di Tada, and the contributions have been fantastic. We even have an early SDK that Olaf put together while working with InSTEDD that simplifies how to build modules that extend Riff. We haven't shown it because the UI has big (massive) room for improvement (in other words it's quite terrible right now in relation to the potential of the tool). Mea culpa. But folks who have seen it tell us it will be worth the wait if we do a competent job at the user experience.

      On a side note, I am off to Foo Camp this weekend thanks to the generosity of Tim O'Reilly, where I expect to learn a lot, and after that I'm straight off to Phnom Penh to continue the hiring process and setting up our innovation lab.

    Monday, June 23, 2008

    Mesh4x SMS Adapter: Sync data without an Internet connection

    Mesh4x has a new feature that allows you to sync data between a local desktop, server or mobile device and a remote computer even if you have no Internet access, by sending and receiving little batches of text messages. Databases, spreadsheets and even maps can be kept up to date using the right adapters. Algorithmic work was done to minimize the number of text messages needed, and the result is having up-to-date information on both ends of the exchange. This data can in turn be shared further with other devices locally and synchronized again to the remote source.

    Scenarios

    OpenMRS (http://openmrs.org) is an open-source Medical Record Management system used extensively in Africa (Tanzania, Rwanda, Malawi, Zimbabwe, Kenya...) and increasingly in the Middle East and the Americas (Peru, Honduras and Haiti come to mind). OpenMRS is used to improve patient care and simplify records management at the clinic where it's used. It is common for these clinics to have just one computer and no Internet connection. Cell phone coverage can be present, ranging from reliable to dodgy for voice (just 1 bar of signal is typically reliable enough for SMS, but terrible for voice or data).

    A rural clinic in Rwanda, photo credit Neal Lesh of the OpenMRS community

    There are two sync scenarios I heard about this week talking with the OpenMRS and OpenROSA teams that Mesh4x addresses. (Note: we haven't planned to do this work yet; I'm just using these scenarios as concrete examples of how mesh4x over SMS can help in the context of medical record management.)

    Scenario 1: OpenMRS to OpenMRS sync

    The clinic is updating patient records that need to be kept up to date with the province-level hospital. In this case the clinic has a computer under a desk with a cell phone reliably plugged into it, and periodically, it would sync with a similar setup at the province level. It could also go straight up to central and then down again to the province level, as province hospitals do tend to have connectivity.

    Mr. Vanra Ieng shows a nifty enclosure that makes sure the phone plugged in to the computer will be reliably working!

    Here you can see Vanra Ieng from the WHO/Ministry of Health in Cambodia showing a physical enclosure that makes sure his phones (used in a similar setup, as an attachment to a computer) don't get unplugged from the PC or power, and are used for 'intended purposes' only. People have personal phones and other means of communication as well, and he needs to make sure the setup keeps running, as this is for a pilot on sending disease indicators from key districts to central level.

    Scenario 2: OpenMRS to mobile data gathering client

    javaROSA is an open-source mobile client built in Java for XForms-based data collection; it works on lowest-common-denominator phones as well as PDAs. You can fill in the forms and send the data via infrared, Bluetooth, or HTTP (if GPRS is available).

    If I understood the conversations at the HISA meetings correctly, they are working on a feature to send data one-way via SMS messages (serializing objects and sending them over a set of messages). With the SMS adapter, community health workers could collect data on their mobile devices and update centralized computers. They could also refresh the information on the device by querying for patients they hadn't seen before but are facing now, or patients that have visited the clinic since the information was taken. In addition, they could even beam (SMS) information to a colleague directly, phone to phone.

    In each scenario, though, how many text messages are we talking about? In our tests, we started with a large up-to-date dataset (a KML map) and added a "pushpin" with a relatively long description. Syncing that change required a grand total of 8 text messages. This includes all the steps needed to compare versions on both sides of the communication and send the new pushpin over (see Under the hood for more details).

    Of course, the more items that have changed, and the larger the items themselves, the more messages are required to transmit them. But we think this is a very low baseline considering the outcome: up-to-date information on both sides that can, in turn, be shared with more devices locally using even more economical means such as infrared or Bluetooth.
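    As a rough way to reason about how the message count grows with the size of the change, here is a back-of-the-envelope sketch. It relies on one fact (a single GSM 7-bit text message holds 160 characters) and one assumption that is not taken from Mesh4x: a hypothetical 6-character per-message header. It only counts messages carrying payload, not the version-comparison exchanges included in the 8-message figure above.

```python
import math

SMS_CHARS = 160  # characters in one GSM 7-bit text message


def messages_needed(payload_chars, header_chars=6, sms_chars=SMS_CHARS):
    """Estimate how many texts an encoded payload needs.

    header_chars is a hypothetical per-message overhead (batch id,
    sequence number, and so on); it is not Mesh4x's actual format.
    """
    usable = sms_chars - header_chars
    if usable <= 0:
        raise ValueError("header leaves no room for payload")
    return math.ceil(payload_chars / usable)


if __name__ == "__main__":
    for size in (100, 500, 2000):
        print(size, "chars ->", messages_needed(size), "message(s)")
```

    The takeaway is simply that cost grows linearly with the encoded size of the changes, which is why shrinking the payload before sending matters so much.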

    Under the hood

    So what does it take to synchronize data over text messages? 

    1) We need to be able to send/receive SMS messages from a phone via a USB cable. In the code we abstract this behind a provider model, and the default implementation will be based on SMSLib. In the future we envision a server version as well, potentially using BT's web21c infrastructure.

    2) The mesh protocol must be reduced to a bare minimum so it is efficient to use over tiny and unreliable text messages. We do so by combining exchanges that achieve the following:

    1. A collection-level check: is any sync needed?
    2. Item-level checks: which items have been added, updated, or deleted relative to the version information available locally?
    3. Item exchange: 2-way sending and receiving of the changed items themselves. Originally we were zipping the data and sending that over if appropriate; now we are using a variation of the rsync algorithm, which uses creative hashing (math operations on the data) to send the minimal information over.

    3) SMS is an unreliable transport and as such there is a layer in the code that compensates for this by managing message batches. A batch allows us to split up a large payload into text messages to reconstitute on the other side, tolerating messages coming out of order, dealing with lost messages, and timing out on operations that have taken too long to complete.
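    To make the batching idea concrete, here is a minimal sketch of splitting a payload into numbered messages and reassembling it on the other side. Everything about the wire format is an assumption made for illustration (a 6-character hex header holding batch id, sequence number and total count); it is not the format Mesh4x uses. The payload is assumed to be already encoded into SMS-safe characters.

```python
HEADER_CHARS = 6  # hypothetical header: 2 hex chars each for id, seq, total


def split_batch(batch_id, payload, sms_chars=160):
    """Split payload into messages shaped '<id><seq><total><chunk>'."""
    size = sms_chars - HEADER_CHARS
    chunks = [payload[i:i + size] for i in range(0, len(payload), size)] or [""]
    if len(chunks) > 255 or not 0 <= batch_id <= 255:
        raise ValueError("batch does not fit the 2-hex-digit header fields")
    total = len(chunks)
    return [f"{batch_id:02x}{seq:02x}{total:02x}{chunk}"
            for seq, chunk in enumerate(chunks)]


class BatchAssembler:
    """Collects messages in any order and rebuilds complete batches."""

    def __init__(self, timeout_seconds=600.0):
        self.timeout = timeout_seconds
        self.parts = {}    # batch_id -> {seq: chunk}
        self.totals = {}   # batch_id -> expected number of messages
        self.started = {}  # batch_id -> time the first message arrived

    def accept(self, message, now):
        """Store one message; return the payload once complete, else None."""
        batch_id = int(message[0:2], 16)
        seq = int(message[2:4], 16)
        total = int(message[4:6], 16)
        self.started.setdefault(batch_id, now)
        self.totals[batch_id] = total
        # A duplicate delivery just overwrites the same slot.
        self.parts.setdefault(batch_id, {})[seq] = message[HEADER_CHARS:]
        if len(self.parts[batch_id]) < total:
            return None
        chunks = self.parts.pop(batch_id)
        del self.totals[batch_id], self.started[batch_id]
        return "".join(chunks[i] for i in range(total))

    def missing(self, batch_id):
        """Sequence numbers not yet received (candidates for a resend request)."""
        got = self.parts.get(batch_id, {})
        return [i for i in range(self.totals.get(batch_id, 0)) if i not in got]

    def expire(self, now):
        """Drop batches left incomplete for longer than the timeout."""
        stale = [b for b, t in self.started.items() if now - t > self.timeout]
        for b in stale:
            del self.parts[b], self.totals[b], self.started[b]
        return stale


if __name__ == "__main__":
    messages = split_batch(1, "pushpin data " * 30)
    assembler = BatchAssembler()
    result = None
    for msg in reversed(messages):  # deliver out of order
        result = assembler.accept(msg, now=0.0)
    print(len(messages), "messages, reassembled OK:", result == "pushpin data " * 30)
```

    Passing `now` in explicitly, rather than reading the clock inside the class, is a design choice that keeps the timeout logic easy to test; a real adapter would also need a way to ask the sender to retransmit whatever `missing` reports.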

    It is important to understand that the goal of this adapter is not only "sending" the data for a new item or "receiving" it. The adapter also checks which items need to be sent or received, and it sends and receives the full versioning information for each item. That makes it possible to keep sharing the item with other applications and users while maintaining the ability to reconcile updates and detect version conflicts.
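    The sketch below illustrates why carrying version information matters. It is a deliberate simplification and not the FeedSync algorithm: each item is assumed to carry a `history` list of `(author, sequence)` entries, newest last, and all function names are hypothetical. A digest over the latest entries answers "is any sync needed?", a per-item comparison decides what to send or request, and comparing full histories reveals conflicts.

```python
import hashlib


def collection_digest(items):
    """Collection-level check: one short hash over every item's latest version."""
    h = hashlib.sha1()
    for item_id in sorted(items):
        author, seq = items[item_id]["history"][-1]
        h.update(f"{item_id}:{author}:{seq};".encode("utf-8"))
    return h.hexdigest()


def version_summary(items):
    """Item-level check input: the latest (author, sequence) per item."""
    return {item_id: item["history"][-1] for item_id, item in items.items()}


def plan_sync(local, remote_summary):
    """Return (ids to send, ids to request) given the other side's summary.

    Deletions are assumed to be stored as tombstone items, so they show up
    here as ordinary updates.
    """
    send, request = [], []
    for item_id in sorted(set(local) | set(remote_summary)):
        if item_id not in remote_summary:
            send.append(item_id)            # they have never seen it
        elif item_id not in local:
            request.append(item_id)         # we have never seen it
        else:
            history = local[item_id]["history"]
            remote_latest = remote_summary[item_id]
            if history[-1] == remote_latest:
                continue                    # already identical
            if remote_latest in history:
                send.append(item_id)        # their version is an ancestor of ours
            else:
                request.append(item_id)     # they have changes we lack
    return send, request


def is_conflict(local_history, remote_history):
    """True when neither side's latest change is known to the other."""
    return (local_history[-1] not in remote_history
            and remote_history[-1] not in local_history)


if __name__ == "__main__":
    clinic = {"p1": {"history": [("clinic", 1), ("clinic", 2)], "data": "v2"}}
    hospital = {"p1": {"history": [("clinic", 1)], "data": "v1"}}
    if collection_digest(clinic) != collection_digest(hospital):
        print(plan_sync(clinic, version_summary(hospital)))
```

    Because the history travels with the item, the receiving side can later pass it on to a third device and that device can still tell an update from a conflict.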

    A big kudos to Tondat, who has been moving at warp speed with this codebase. The first checkin was on June 9th! The quality of the code is very high, and the ingenious use of gzip, Base91 and rsync shows. Check out the source.

    Next steps

    • Finish the optimization of the FeedSync protocol (which Mesh4x uses under the covers) in edge conditions (e.g. sharing conflicting payloads).
    • Implement the SMSLib adapter and test it well with a couple of appropriate phones.
    • Add this capability into the demo Java application that is used to demonstrate the KML adapter. This will let you specify a phone number in addition to the http URL and the file path in the sync endpoint box.
    • Further optimize formats, encoding and memory usage.
    • Pursue collaborations with openROSA/openMRS that resonate strongly with the community needs we are seeing in South East Asia. If you think of any scenarios where this could help your technology, please share them here!

    References

    http://mesh4x.org
    http://en.wikipedia.org/wiki/Rsync