Shared posts

09 Apr 17:13

MongoDB foreground index builds

This one is explored in depth over at nelhagedebugsshit.tumblr.com.

MongoDB has two different approaches to building indexes, selectable by the operator at index-creation time.

  • “foreground” index builds lock the entire server for the duration of the build, but promise faster builds and more compact indexes, since they can look at all of the data up front.
  • “background” builds are slower and lead to more disk fragmentation, but allow other queries (reads and writes) to proceed during the index build, by building the B-Tree in parallel with ongoing operations, and merging new writes into the index-in-progress.

It’s a fine theory. But until MongoDB 2.6, foreground index builds were actually dramatically slower on large collections!

It turns out that background builds did the obvious naïve B-Tree construction: creating an empty B-Tree and inserting each record one at a time. This approach has some weaknesses, but it’s pretty clearly O(n*log n): O(n) inserts, each of which is an O(log n) tree insert.

Foreground builds, able to stop the world, tried to be clever by doing an external-memory sort, and then building the B-Tree in place on top of the sorted records. This approach is in fact much faster if implemented correctly, since it reduces disk I/O (even though it will also be O(n*log n) comparisons asymptotically – you’re still doing a comparison sort, after all).

However, the MongoDB 2.4 and earlier external sort misses this goal! The sort divides the N records into (N/k) chunks, for a roughly-constant k, sorts each chunk, and then merges them by repeatedly scanning each chunk for the global minimum. But for constant k, N/k = O(N), and so it ends up doing N * O(N) comparisons, for O(N²) work. As long as N is small enough that N/k is small, it does beat out the incremental B-Tree build, but somewhere around 1M records it tips over, and gets rapidly slower.

Mongo 2.6 fixes this by using a min-heap to do the merge, restoring O(n*log n) asymptotic performance, while retaining the I/O efficiency of the external sort. Hooray.
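
To make the difference between the two merge strategies concrete, here is a minimal sketch in Java. It is not MongoDB's code: the class and method names are made up for illustration, and the chunks are small in-memory lists rather than sorted runs on disk. With m chunks, mergeByScanning does O(m) comparisons per output record, while mergeWithHeap does O(log m).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative only: in-memory stand-in for merging sorted on-disk runs.
class KWayMerge {

    // Old-style merge: scan the head of every chunk to find each minimum.
    // Cost per output record is O(m) for m chunks.
    static List<Integer> mergeByScanning(List<List<Integer>> chunks) {
        int[] pos = new int[chunks.size()];
        List<Integer> out = new ArrayList<>();
        while (true) {
            int best = -1;
            for (int c = 0; c < chunks.size(); c++) {
                if (pos[c] < chunks.get(c).size()
                        && (best < 0
                            || chunks.get(c).get(pos[c]) < chunks.get(best).get(pos[best]))) {
                    best = c;
                }
            }
            if (best < 0) {
                return out; // every chunk is exhausted
            }
            out.add(chunks.get(best).get(pos[best]++));
        }
    }

    // Heap-based merge: keep one {value, chunk, index} entry per chunk in a
    // min-heap. Cost per output record is O(log m).
    static List<Integer> mergeWithHeap(List<List<Integer>> chunks) {
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int c = 0; c < chunks.size(); c++) {
            if (!chunks.get(c).isEmpty()) {
                heap.add(new int[] {chunks.get(c).get(0), c, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] min = heap.poll();
            out.add(min[0]);
            List<Integer> chunk = chunks.get(min[1]);
            int next = min[2] + 1;
            if (next < chunk.size()) {
                heap.add(new int[] {chunk.get(next), min[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> chunks = Arrays.asList(
            Arrays.asList(1, 4, 9),
            Arrays.asList(2, 3, 10),
            Arrays.asList(5, 6, 7, 8));
        System.out.println(mergeWithHeap(chunks));   // [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        System.out.println(mergeByScanning(chunks)); // same result, more comparisons
    }
}
```

Both methods produce the same sorted output; only the work per record differs, which is what matters once the number of chunks grows with N.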

09 Apr 15:01

006 - Rat & Co x Ben Thomas

by Create Explore
Dmitry Krasnoukhov

❤️

Listen to 006 on Soundcloud

16 Dec 08:40

via kremlint: i got the russian spacecraft simulator...

via kremlint:

i got the russian spacecraft simulator working

More information: http://4archive.org/g/res/42590842

“Spacecrafts use software to provide an interface to the astronauts. There’s a training version of this very software which simulates a flight. It’s what you see in the OP pic. It looks like some 1337 haX0r software since it’s been developed in the 60ies until at least 2002.”

11 Nov 17:20

Rubinius 3.0 - Part 1: The Rubinius Team

by Brian Shirai

Today, I'm introducing some big changes for Rubinius. As the title says, this post is only part one. There are five parts and I'll publish one each day this week. I'll be covering these topics: the Rubinius Team, the development and release process, the Rubinius instruction set, the Rubinius system and tools, and one more thing.

Also, as the title says, this is Rubinius 3.0. The past year has been incredibly influential in helping me understand the many facets of Rubinius as a project and Ruby as a language and community. The other posts will dive into more detail, but I want to highlight that all of this is Rubinius 3.0.

Introducing the Rubinius Team

Sometimes we save the best for last, but this is not one of those times. I'm tremendously honored and excited to introduce you to the Rubinius Team. They've all volunteered to contribute their time, experience, and passion to improving Rubinius and its impact to make the world better.

Here we are, in no particular order:

Sophia Shao: As a recent graduate of Carnegie Mellon University's Electrical & Computer Engineering department, Sophia is currently tackling a massive application migration from MRI to Rubinius. She's also been improving Rubinius every day. Hit her up for tips about debugging machine code.

Jesse Cooke: As co-founder of Watsi, a venture to fund healthcare for people around the world, Jesse was part of YCombinator's first ever non-profit. Jesse has been contributing in any way he can to Rubinius for a long time. If you visit Portland, OR, you may see him riding this weird bike with a belt instead of a chain.

Valerie Concepcion: If you're interested in getting things like Raspberry PI's, Legos, and Wii Remotes to play well together, Valerie can help. Drawn to the Maker movement and inspired by her friends who work in non-profits, she is interested in applying technology for social good.

Stacy Mullins: At one point, Stacy would have gladly chosen a typewriter over a computer. But at school for graphic design, she became fascinated by technologies like HTML and CSS and the ability to create something from scratch. Now she's learning about crafting code and communicating well with other developers.

Yorick Peterse: When not breaking code, Yorick is fixing it and asking questions. Either way, there is a lot of code happening. He's drawn to the deep technical details of systems like just-in-time compilers and concurrency. He may or may not be a Dr. Evil character hatching plans for world domination.

Brian Shirai: Having once passed over Ruby for being too much like Perl, Brian rediscovered Ruby over ten years ago and has been working on Rubinius for the past eight. Inadvertently, he's also learned Perl.

Why?

After all these years, why do I want to form a Rubinius Team? And what is it? Is it like the "core team" we see in Rails or other projects?

I'm so glad you're wondering about that!

Early in the Rubinius project, Evan Phoenix started a policy we called "the open commit bit": if we accept your patch, you get permission to commit changes to the source code repository.

This contrasted with many open source projects that had a small number of people who could make changes to the code. Usually, this group was called a "core team". Limiting permission to change the code was seen as an essential part of maintaining code quality. If anyone could commit, people would just make a big mess.

This conventional wisdom turned out to be false. We let anyone who made one good patch have access to commit any changes they wanted. In practice, almost everyone was extremely careful. We rarely had to revert changes, and when we did, it was not usually a question of quality. Hundreds of people committed changes and Rubinius benefited a great deal.

For this reason, whenever the topic of a "core team" for Rubinius came up, Evan opposed it. There was no real value in trying to be gate keepers. Giving people the opportunity to contribute and welcoming them to do so had a positive impact and showed appreciation for their efforts.

The Rubinius Team is not about creating a different class of contributor, exclusiveness, gate-keepers, or overseers.

Another characteristic of the typical open source project "core team" is that the members are usually the most technically skilled and have the greatest number of commits. This automatically creates an imbalance of emphasis on only technical issues and technical expertise, despite the fact that the vast majority of people using, contributing to, or impacted by a project will not be "top technical contributors".

The Rubinius Team is not focused exclusively, or even primarily, on the technical aspects of the project.

A third characteristic of typical "core teams" is the implicit privilege of the members and the resulting economic, gender and diversity imbalance. Someone struggling with two jobs won't have time to be a top committer, no matter how capable they are. Likewise for someone caring for kids at home, a responsibility that disproportionately rests with women. All of these problems stem from the dangerous fallacy that open source software is a "meritocracy".

So, what is the Rubinius Team?

The Rubinius Team is a group of people who work together, influenced by our values, to accomplish things that fulfill the Rubinius vision and mission.

Our vision is a world where Ruby is the most useful programming language for building things that improve people's well-being and quality of life.

"Most useful" means the most benefit for the least amount of effort for the greatest number of people. There will always be incredibly smart people who do very difficult things. For the rest of us, to steal a quote by Moshe Feldenkrais, we want to "make the impossible possible, the hard easy, and the easy elegant".

Our mission is to build the best Ruby implementation and the best programming tools that benefit the greatest number of people, prioritizing our efforts to improve access for people who have been marginalized and excluded.

We value impact, quality, inclusiveness, diversity and balance, and we actively promote them. We celebrate our differences and appreciate them as a source of strength. We prioritize improving access and championing the needs of people who have traditionally been excluded. We get things done, lead by example and we constantly strive to improve. We realize that we enjoy a lot of privilege and we work hard to empower others rather than advancing our own interests.

We welcome anyone who shares our vision, mission and values to be a part of the Rubinius Team. And one of our objectives will be growing the team. There are many roles to play. From outreach to industry, academia, and communities like Women Who Code and Black Girls Code to marketing, budgeting, and planning. From documentation to organizing meetups. There are many ideas we don't even know about yet, and are waiting for you to create.

It's about quality

I want to talk more about the over-emphasis of technology in open source projects because I don't hear this discussed often.

The source code written is a small part of a much bigger picture. The purpose of design is to create something that is useful for humans. Better understanding leads to better design. Better design leads to a more effective tool. A more effective tool leads to better engagement. Better engagement leads to greater understanding. There is no hierarchy here; there is no ranking. They form a circle of interaction. Each of these is important, and any one of them is only as good as all the others.

We strive to ensure that we are reaching the people we want to help, and that we are helping the people we want to reach. We do this by seeking global understanding of the problems our community needs to solve. Too narrow a focus on the local technology problems will mislead us.

Pondering these matters leads us to consider the Rubinius community.

The Rubinius Community

I have a very broad view of the Rubinius community. It includes developers and people learning to write Ruby. It also involves people who are not primarily involved in programming but may need to understand or even write some Ruby code. For example, a database developer working with a team of Ruby programmers on an application. The community also includes the people who use the software written in Ruby. And it includes the businesses who employ people to write in Ruby.

The Rubinius Team is also a part of the Rubinius community. The relationship between the Rubinius Team and the Rubinius community is important. The Team's purpose is to help the community. And here, "help" means to serve.

In business the people we serve are our customers, but the concept of a customer is not common to open source projects. Since people do not usually pay for open source software, the idea of a customer does not seem to make sense. However, envisioning the user of Rubinius as a customer has many benefits. To develop an effective product, we must deeply understand the needs of a customer.

The customer relationship provides important benefits to both sides. On one hand it clarifies who we, the Team, are trying to help and to whom we are responsible. On the other, it makes clear the customer's responsibility to engage and communicate clearly, and to provide feedback to help us improve. Both sides must be vested in the relationship.

This is where the analogy of a typical business relationship begins to break down when applied to open source. We provide a thing of value: Rubinius. What thing of value does the customer provide in return? One thing is the person's time. Taking the time to try Rubinius, open an issue, or share their experience with someone else is a thing of value they are giving Rubinius. However, there is not yet a thing that has the same tangible value as money. When we are asked to pay money for something, it increases the stakes for us.

We want the Rubinius community to be healthy, inclusive, safe, and helpful. We want people to learn and grow and build awesome things. So we are adopting a Code of Conduct for the community based on the Citizen Code of Conduct by the excellent Stumptown Syndicate. We know this will be an important aspect of creating an environment of respect and support as we continue to explore how to improve the relationship between Rubinius as a product and project and those who use Rubinius.

I'm excited to share more about the path of Rubinius 3.0 in the other posts this week. We'd love to hear from you. Please send your comments to community@rubini.us.

Acknowledgments

I want to thank the following people: Ashe Dryden and James Coglan, who have forced me to question many things about open source projects. Evan Phoenix for starting and leading Rubinius. Chad Slaughter for taking a risk and being a stellar mentor. The Rubinius Team, Sophia, Jesse, Valerie, Stacy, and Yorick, for their generosity and feedback on the post. Joe Mastey and Gerlando Piro for their review and many fruitful conversations. Enova, for giving me hard problems to solve. And thanks to you, the Rubinius community, for making it worthwhile.

29 Oct 10:25

This is why I can’t have conversations using Twitter

Yesterday Stripe engineers wrote a detailed report of why they had an issue with Redis. This is very appreciated. In the Hacker News thread I explained that, now that we have diskless replication (http://antirez.com/news/81), persistence is no longer mandatory for people running a master-slave replica set. This changes the design constraints: now that replicas can synchronize without touching the disk, it is worth better supporting, in a safer way, the Stripe (ex?) use case of a replica set with persistence turned off. This is a work in progress.

In the same post, Stripe engineers said that they are going to switch to PostgreSQL for the use case where they have issues with Redis. PostgreSQL is a great database indeed, and many times, if you can go with the SQL data model and an on-disk database, it is better to use that instead of Redis, which is designed for when you really want to scale to a lot of complex operations per second. Stripe engineers also said that they measured the 99th percentile and it was better with PostgreSQL than with Redis, so in a tweet @aphyr wrote:

“Note that *synchronous* Postgres replication *between AZs* delivers lower 99th latencies than asynchronous Redis”

And I replied:

“It could be useful to look at average latency to better understand what is going on, since I believe the 99% percentile is very affected by the latency spikes that Redis can have running on EC2.”

Which means: if you also have the average, you can tell whether the 99th percentile is ruined (or not) by latency spikes, which many times can be solved. Usually it is as simple as that: if you have a very low average but the 99th percentile is bad, it is likely not that Redis is running slow because, for example, the operations performed are very time consuming or blocking. Instead, a subset of queries is served slowly because of the usual issues on EC2: fork time on certain instances, remote disk I/O, and so forth. This is stuff that you can likely address since, for example, there are instance types without the fork latency issue.
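
The reasoning can be sketched with a few lines of Java. The numbers below are invented for illustration (they are not Stripe's or Redis's measurements), and the class and method names are hypothetical. The sketch compares two samples with the same bad 99th percentile: one where a few spikes ruin the tail, and one where every request is slow.

```java
import java.util.Arrays;

// Illustrative only: hypothetical latency samples in milliseconds.
class LatencySummary {

    static double mean(double[] samplesMs) {
        double sum = 0;
        for (double s : samplesMs) {
            sum += s;
        }
        return sum / samplesMs.length;
    }

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static double percentile(double[] samplesMs, double p) {
        double[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Case A: 980 fast requests at 1 ms plus 20 spikes at 200 ms.
        double[] spiky = new double[1000];
        Arrays.fill(spiky, 0, 980, 1.0);
        Arrays.fill(spiky, 980, 1000, 200.0);

        // Case B: every request takes 200 ms.
        double[] slow = new double[1000];
        Arrays.fill(slow, 200.0);

        System.out.printf("spiky: avg=%.2f ms, p99=%.0f ms%n",
            mean(spiky), percentile(spiky, 99));
        System.out.printf("slow:  avg=%.2f ms, p99=%.0f ms%n",
            mean(slow), percentile(slow, 99));
    }
}
```

Both cases report a 99th percentile of 200 ms, but the averages are 4.98 ms and 200 ms. The percentile alone cannot tell the two apart; the pair of numbers can, which is the only use of the average being proposed here.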

For half the Twitter IT community, my statement was to promote the average latency as the right metric over 99th percentiles:

"averages are the worst possible metric for latency. No latency I've ever seen falls on a bell curve. Averages give nonsense."

"You have clearly not understood how the math works or why tail latencies matter in dist sys. I think we're done here."

“indeed; the problem is that averages are not robust in the presence of outliers”

Ehm, who said that the average is a good metric? I proposed it to *detect* whether or not there are big outliers. So during what was supposed to be a normal exchange, I find after 10 minutes my Twitter completely full of people telling me that I’m an idiot for endorsing averages as The New Metric For Latency in the world. Once you get the first retweets, you get more and more. Even a notable builder of another NoSQL database finds the time to lecture me on a few things via Twitter. I reply saying that clearly what I wrote was that if you have 99th + avg you have a better picture of the curve and can understand if the problem is the Redis spikes on EC2, but magically the original tweet gets removed, so my tweets are now even more out of context. My three tweets:

1. “may point was, even if in the internet noise I'm not sure if it is still useful, that avg helps to understand why (…)”
2. “the 99% percentile is bad. If avg is very good but 99% percentile is bad, you can suspect a few very bad samples”
3. “this is useful with Redis, since with proper config sometimes you can improve the bad latency samples a lot.”

Guess what? There is even somebody who isolated tweet #2, which was the continuation of “to understand why the 99% percentile is bad” (bad as in, not providing good figures), and just read it out of context: “the 99% percentile is bad”.

Once upon a time, people used to argue for days on Usenet, but at least there was, most of the time, an argument against a new argument and so forth, with enough text and context to have a normal condition. This instead is just amplification of hate and engineering rules 101 together. 99th latency is the right metric and average is a poor one? Make sure you don’t talk about averages, even in a context where it makes sense, otherwise you get 10000 shitty replies.

What to do with that? Now, a good thing about me is that I’m not much affected by all this personally, but it is also clear that, because I use Twitter as a matter of work, in order to inform people of what is happening with Redis, this is not a viable working environment. For example, latency: I care a lot about latency, so a lot of effort went over the years into improving it (including diskless replication). We have monitoring as well, in order to understand if and why there are latency spikes: Redis can provide you a human-readable report of what is happening inside of it by monitoring different execution paths. After all this work, what you get instead is the wrong message retweeted one million times, which does not help. Most people will not follow the tweets to form an opinion themselves; the reality is, at this point, rewritten: I said that the average is good and I don’t realize that you should look at the long tail. Next time I talk about latency, for many people, I’ll be the one who has a few unclear ideas about it, so who knows what I’m talking about or what I’m doing?

At the same time, Twitter is RSS for humans: it is extremely useful to keep many people updated about what I love to do, which is to work on my open source project, which so far I have tried to develop with care. So I’m trying to think about what a viable setup can be. Maybe I can just blog more, and use the Redis mailing list more, and use Twitter just to link stuff so that interested people can read, and interested people can argue and have real and useful discussions.

I’ve a lot of things to do about Redis, for the users that have a good time with it, and a lot of things to do for the users that are experiencing problems. I feel like my time is best spent hacking instead of having non-conversations on Twitter. I love to argue, but this is just a futile exercise.

11 Oct 17:15

Amazon DynamoDB Update - JSON, Expanded Free Tier, Flexible Scaling, Larger Items

by Jeff Barr
Dmitry Krasnoukhov

Sounds pretty cool

Amazon DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale. Our customers love the fact that they can get started quickly and simply (and often at no charge, within the AWS Free Tier) and then seamlessly scale to store any amount of data and handle any desired request rate, all with very consistent, SSD-driven performance.

Today we are making DynamoDB even more useful with four important additions: Support for JSON data, an expanded free tier, additional scaling options, and the capacity to store larger items. We've also got a new demo video and some brand-new customer references.

JSON Document Support
You can now store entire JSON-formatted documents as single DynamoDB items (subject to the newly increased 400 KB size limit that I will talk about in a moment).

This new document-oriented support is implemented in the AWS SDKs and makes use of some new DynamoDB data types. The document support (available now in the AWS SDK for Java, the SDK for .NET, the SDK for Ruby, and an extension to the SDK for JavaScript in the Browser) makes it easy to map your JSON data or native language objects onto DynamoDB's native data types, and it supports queries that are based on the structure of your document. You can also view and edit JSON documents from within the AWS Management Console.

With this addition, DynamoDB becomes a full-fledged document store. Using the AWS SDKs, it is easy to store JSON documents in a DynamoDB table while preserving their complex and possibly nested "shape." The new data types could also be used to store other structured formats such as HTML or XML by building a very thin translation layer.

Let's work through a couple of examples. I'll start with the following JSON document:

{
  "person_id" : 123,
  "last_name" : "Barr",
  "first_name" : "Jeff",
  "current_city" : "Tokyo",
  "next_haircut" :
  {
    "year" : 2014,
    "month" : 10,
    "day" : 30
  },
  "children"  :
    [ "SJB", "ASB", "CGB", "BGB", "GTB" ]
}

This needs some escaping in order to be used as a Java String literal:

String json = "{"
        +   "\"person_id\" : 123 ,"
        +   "\"last_name\" : \"Barr\" ,"
        +   "\"first_name\" : \"Jeff\" ,"
        +   "\"current_city\" : \"Tokyo\" ,"
        +   "\"next_haircut\" : {"
        +       "\"year\" : 2014 ,"
        +       "\"month\" : 10 ,"
        +       "\"day\" : 30"
        +   "} ,"
        +   "\"children\" :"
        +   "[ \"SJB\" , \"ASB\" , \"CGB\" , \"BGB\" , \"GTB\" ]"
        + "}"
        ;

Here's how I would store this JSON document in my people table:

DynamoDB dynamo = new DynamoDB(new AmazonDynamoDBClient(...)); 
Table table = dynamo.getTable("people");

Item item =
  new Item()
      .withPrimaryKey("person_id", 123)
      .withJSON("document", json);

table.putItem(item);

And here's how I get it back:

DynamoDB dynamo = new DynamoDB(new AmazonDynamoDBClient(...)); 
Table table = dynamo.getTable("people"); 

Item documentItem =
  table.getItem(new GetItemSpec()
                .withPrimaryKey("person_id", 123)
                .withAttributesToGet("document"));

System.out.println(documentItem.getJSONPretty("document"));

The AWS SDK for Java maps the document to DynamoDB's data types before storing it.

I can also represent and manipulate the document in a programmatic, structural form. This code makes explicit reference to DynamoDB's new Map and List data types, which I will describe in a moment:

DynamoDB dynamo = new DynamoDB(new AmazonDynamoDBClient(...)); 
Table table = dynamo.getTable("people"); 

Item item =
  new Item()
      .withPrimaryKey("person_id", 123)
      .withMap("document",
               new ValueMap()
                   .withString("last_name", "Barr")
                   .withString("first_name", "Jeff")
                   .withString("current_city", "Tokyo")
                   .withMap("next_haircut",
                            new ValueMap()
                                .withInt("year", 2014)
                                .withInt("month", 10)
                                .withInt("day", 30))
                   .withList("children",
                             "SJB", "ASB", "CGB", "BGB", "GTB"));

table.putItem(item);

Here is how I would retrieve the entire item:

DynamoDB dynamo = new DynamoDB(new AmazonDynamoDBClient(...)); 
Table table = dynamo.getTable("people"); 

Item documentItem =
  table.getItem(new GetItemSpec()
                    .withPrimaryKey("person_id", 123)
                    .withAttributesToGet("document"));

System.out.println(documentItem.get("document"));

I can use a Document Path to retrieve part of a document. Perhaps I need the next_haircut item and nothing else. Here's how I would do that:

DynamoDB dynamo = new DynamoDB(new AmazonDynamoDBClient(...)); 
Table table = dynamo.getTable("people"); 

Item partialDocItem =
  table.getItem(new GetItemSpec()
                    .withPrimaryKey("person_id", 123)
                    .withProjectionExpression("document.next_haircut"));

System.out.println(partialDocItem);

Similarly, I can update part of a document. Here's how I would change my current_city back to Seattle:

DynamoDB dynamo = new DynamoDB(new AmazonDynamoDBClient(...)); 
Table table = dynamo.getTable("people"); 

table.updateItem(
  new UpdateItemSpec()
      .withPrimaryKey("person_id", 123)
      .withUpdateExpression("SET document.current_city = :city")
      .withValueMap(new ValueMap().withString(":city", "Seattle")));

As part of this launch we are also adding support for the following four data types:

  • List - An attribute of this data type consists of an ordered collection of values, similar to a JSON array. The children section of my sample document is stored in a List.
  • Map - An attribute of this type consists of an unordered collection of name-value pairs, similar to a JSON object. The next_haircut section of my sample document is stored in a Map.
  • Boolean - An attribute of this type stores a Boolean value (true or false).
  • Null - An attribute of this type represents a value with an unknown or undefined state.

The mapping from JSON to DynamoDB's intrinsic data types is predictable and straightforward. You can, if you'd like, store a JSON document in a DynamoDB table and then retrieve it using the lower-level "native" functions. You can also retrieve an existing item as a JSON document.

It is important to note that the DynamoDB type system is a superset of JSON's type system, and that items which contain attributes of Binary or Set type cannot be faithfully represented in JSON. The Item.getJSON(String) and Item.toJSON() methods encode binary data in base-64 and map DynamoDB sets to JSON lists.

Expanded Free Tier
We are expanding the amount of DynamoDB capacity that is available to you as part of the AWS Free Tier. You can now store up to 25 GB of data and process up to 200 million requests per month, at up to 25 read capacity units and 25 write capacity units. This is, in other words, enough free capacity to allow you to run a meaningful production app at no charge. For example, based on our experience, you could run a mobile game with over 15,000 players, or run an ad tech platform serving 500,000 impressions per day.

Additional Scaling Options
As you might know, DynamoDB works on a provisioned capacity model. When you create each of your tables and the associated global secondary indexes, you must specify the desired level of read and write capacity, expressed in capacity units. Each read capacity unit allows you to perform one strongly consistent read (up to 4 KB) per second or two eventually consistent reads (also up to 4 KB) per second. Each write capacity unit allows you to perform one write (up to 1 KB) per second.
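
As a rough illustration of the arithmetic, here is a small Java sketch that estimates the capacity units a workload needs. The class and method names are hypothetical and this is not an AWS SDK API. It assumes that 1 KB means 1,024 bytes and that item sizes are rounded up to the next 4 KB (reads) or 1 KB (writes) boundary, which goes slightly beyond what the paragraph above states.

```java
// Illustrative only: back-of-the-envelope capacity planning, not an AWS API.
class CapacityUnits {

    // Assumption: 1 KB = 1024 bytes.
    static final long READ_BLOCK_BYTES = 4 * 1024;
    static final long WRITE_BLOCK_BYTES = 1024;

    // Read capacity units needed to read items of the given size at the
    // given rate. An eventually consistent read costs half as much.
    static long readUnits(long itemBytes, long readsPerSecond, boolean stronglyConsistent) {
        // Assumption: sizes round up to the next 4 KB boundary.
        long blocks = Math.max(1, (itemBytes + READ_BLOCK_BYTES - 1) / READ_BLOCK_BYTES);
        long units = blocks * readsPerSecond;
        return stronglyConsistent ? units : (units + 1) / 2;
    }

    // Write capacity units needed to write items of the given size at the
    // given rate.
    static long writeUnits(long itemBytes, long writesPerSecond) {
        // Assumption: sizes round up to the next 1 KB boundary.
        long blocks = Math.max(1, (itemBytes + WRITE_BLOCK_BYTES - 1) / WRITE_BLOCK_BYTES);
        return blocks * writesPerSecond;
    }

    public static void main(String[] args) {
        // 3 KB items, 10 reads per second.
        System.out.println(readUnits(3 * 1024, 10, true));   // 10
        System.out.println(readUnits(3 * 1024, 10, false));  // 5
        // 6 KB items need two 4 KB blocks per read.
        System.out.println(readUnits(6 * 1024, 10, true));   // 20
        // 1,500-byte items need two 1 KB blocks per write.
        System.out.println(writeUnits(1500, 10));            // 20
    }
}
```

Under these assumptions, the 25 read capacity units in the expanded free tier cover 25 strongly consistent or 50 eventually consistent reads per second for items of up to 4 KB.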

Previously, DynamoDB allowed you to double or halve the amount of provisioned throughput with each modification operation. With today's release, you can now adjust it by any desired amount, limited only by the initial throughput limits associated with your AWS account (which can easily be raised). For more information on this limit, take a look at DynamoDB Limits in the documentation.

Larger Items
Each of your DynamoDB items can now occupy up to 400 KB. The size of a given item includes the attribute name (in UTF-8) and the attribute value. The previous limit was 64 KB.

New Demo Video
My colleague Khawaja Shams (Head of DynamoDB Engineering) is the star of a new video. He reviews the new features and also unveils a demo app that makes use of our new JSON support.

DynamoDB in Action - Customers are Talking
AWS customers all over the world are putting DynamoDB to use as a core element of their mission-critical applications. Here are some recent success stories:


Talko is a new communication tool for workgroups and families. Ransom Richardson, Service Architect, explained why they are using DynamoDB:

DynamoDB is a core part of Talko's storage architecture. It has been amazingly reliable, with 100% uptime over our two years of use. Its consistent low-latency performance has allowed us to focus on our application code instead of spending time fine-tuning database performance. With DynamoDB it was easy to scale capacity to handle our product launch.


Electronic Arts stores game data for The Simpsons: Tapped Out (a Top 20 iOS app in the US, with millions of active users) in DynamoDB and Amazon Simple Storage Service (S3). They switched from MySQL to DynamoDB and their data storage costs dropped by an amazing 90%.

The development team behind this incredibly successful game will be conducting a session at AWS re:Invent. You can attend GAM302 to hear how they migrated from MySQL to DynamoDB on the fly and used AWS Elastic Beanstalk and Auto Scaling to simplify their deployments while also lowering their costs.

Online Indexing (Available Soon)
We are planning to give you the ability to add and remove indexes for existing DynamoDB tables. This will give you the flexibility to adjust your indexes to match your evolving query patterns. This feature will be available soon, so stay tuned!

Get Started Now
These new features are available now and you can start using them today in the US East (Northern Virginia), US West (Northern California), Europe (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Sydney), and South America (Brazil) Regions. They will be available in the near future in other AWS Regions. Developers targeting any of the AWS Regions can download the newest version of DynamoDB Local to develop and test apps locally (read my post, DynamoDB Local for Desktop Development, to learn more).

-- Jeff;

26 Sep 13:26

003 - With One Eye Shut X Andrea

by Create Explore

Listen to 003 on Soundcloud.

17 Sep 16:40

Feeddler 2 for iOS 8

by Che-Bin Liu
Feeddler was first released on April 1st, 2010, two days before the first-generation iPad was released. At one point, it was ranked 11th in the App Store across all categories. In the next three years, Feeddler was constantly ranked in top 20 in the iPad News category, and top 30 for iPhone News.

Feeddler was first written for iOS 3.2 to support Google Reader. Over time, the code was upgraded for iOS 4, 5, and 6. There were more than 20 updates. When Google Reader was shut down, Feeddler continued to support a few RSS services.

Apple made a bold change in iOS 7. But for the first time I couldn't update Feeddler accordingly. My wife was fighting a cancer. I had to pause all plans to support her and the family. Luckily my wife has recovered well after all the chemo treatments, surgeries, and radiation therapy.

In the past few months, I have rewritten most of the code for Feeddler. The new Feeddler uses a similar UI layout with the modern iOS design. New features include Feedly support, full screen reading, full text mode with Readability, and GIF image support among many others. Test users have praised the new app for its improved speed and reliability in iOS 8.

I have submitted Feeddler 2 to the App Store and hopefully it will be approved in time this week for the iOS 8 and iPhone 6 release. Feeddler 2 will be a new app. The Pro version is priced at $4.99 again. If you are not ready to pay (again), I invite you to download the free version of Feeddler 2. I decided to make most Pro version features available in the free version this time.

I hope you will like the new Feeddler and appreciate your continued support. Thank you!

14 Sep 13:26

The Tilde Art Project You Have to See to Believe

by Tom Dale

Did you know that you can exchange money for goods and services? I did, but for some reason it still never occurred to me that I might be able to pay an artist to paint something just for me.

That is, until my former co-worker Mischa McLachlan told me he was commissioning a piece from noted artist Chet Zar, whose dark, surreal paintings you may recognize.

"What?" I asked Mischa, "You mean you can just pay an artist money and they will do a custom painting for you?"

"I mean, you can always ask," he said.

You can imagine my jealousy when I later went over to Mischa's apartment and saw these gorgeous paintings hanging on the wall.

(You may notice a resemblance.)

I had no immediate need for a painting, but the seed had been planted.

Fast-forward a few years, to a couple months before Tilde's one year anniversary. I had been wanting to get a gift for my other co-founders, but, as a bootstrapped startup, the traditional gift of a gold Rolex was slightly out of the budget.

While sitting at home and enjoying a glass of wine by myself ("Never spend more than $6 on a bottle of wine" is the best advice my mom ever gave me), I decided to check what was new on Hacker News.

Amid the seventeen new minimal JavaScript frameworks launched that day, something caught my eye: Old School Color Cycling with HTML5.

The static screenshot doesn't do this justice. Click the link!

I love artwork like this, and I loved that this clever, old-school hack for adding animation to games had been ported to the web for a new generation to enjoy.

Inhibitions lowered by the wine, I decided to fire off an email to the artist, Mark Ferrari, and ask him if he'd be up for something a little… different.

Mark was gracious enough to reply the next day, but I was disappointed to find out that he had retired from the art world and was focusing his efforts on writing novels.

Not to be deterred, we discovered that I'd soon be traveling through Seattle, where he lived, and we agreed to meet and at least discuss the project.

Turns out, we hit it off. I told him about how we started Tilde to fight back against the tide of insane venture capital money flowing through San Francisco, about open source, and how we wanted to control our own destinies, free of investor interference.

Mark regaled me with stories of the early videogame world, and how the golden era of fun and experimentation was driven out by the influx of big money.

Somehow, Mark agreed to come out of "retirement" to make my plan a reality: a company portrait, set in a lush fantasy universe.

Mark needed reference photos of everyone, but I wasn't sure how to get them without ruining the surprise. A couple weeks later, I had everyone come into the office on a weekend, but I didn't let on what was about to happen.

I wish you could see the look of skepticism on the faces of everyone when they walked into a room filled with armor, robes, swords, bows and the other accoutrements of life in the fantasy realm.

"If you're trying to trick us into LARPing with you, I'm leaving," someone threatened.

Mark (who I referred to as "Mr. X" so no one could Google him and find out his occupation) dressed, positioned, and photographed everyone, then bid us adieu.

Several weeks later, the final product arrived in my inbox. It was gorgeous, and everyone was completely surprised. We had a physical print made from the digital copy that now hangs proudly in the event space in our office.

I won't keep you waiting any longer. Here's the final product:

click for hi-res

My favorite thing about the piece is the story it tells, which was all Mark's idea. The monster, in the background, represents the voracious appetite of venture capital, leaving so many destroyed companies and burnt-out founders in its wake.

In the foreground is the intrepid band of explorers, going out into the dark on their own. Importantly, you can't see the forest; it's as unknown to you as it is to the protagonists. It's up to your imagination what dangers lurk.

Mark did an amazing job with the details, as well. If you have the time, take a second to appreciate the fine detail, which was all drawn by hand in Photoshop.

I particularly love the binary glow of the insects, and the circuit board of Carl's staff.

Before I got started with this crazy plan, the world of art seemed totally foreign to me. I don't have any skills to speak of, and I thought it was something I would never be a part of.

Now I know that there are tons of talented artists out there, and they rely on commissions to help put food on the table.

If you're looking for a truly unique gift, or a way to spruce up your home or offices, just try emailing the artist next time you see something you love. Odds are, they'll be happy to help you bring your idea to life.

And if you happen to like this drawing, well, I'm sure Mark would be happy to help you create something of your own.

Please help me pay for more cool paintings like this by signing up for a free, 30-day trial of Skylight.

12 Sep 13:03

Learning from Apple's livestream perf fiasco

by igrigorik
Dmitry Krasnoukhov

perf.fail :(

Apple's Sept 9th livestream got a lot of press, both good (great products) and bad: broken livestream, site downtime, and so on. We'll leave the products to the press, but let's dissect some of the perf problems…

image

Running the livestream page through WPT shows that even without the video stream the landing page weighs in at 5,790KB - not completely outrageous (we've all seen worse), but pretty hefty. Let's break it down.

image

348 image requests accounting for ~3.7MB! That's a lot of image requests. Digging deeper, most of them appear to come from the curated feed of tweets and images. More specifically, the page fetches feed.json from the www.apple.com origin, which provides the text and image URLs. At this point, we arrive at our first perf fail:

  • The JSON file is 531.7KB, and it's served uncompressed! Applying gzip to the file would reduce it to 57KB - that's roughly 90% savings.
  • To make matters worse, the max-age is set to <10 seconds! On one hand, this is understandable: you want to expire old content and get new updates to your users. But this also significantly reduces the effectiveness of edge caching - apple.com is served via Akamai.
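
To get a feel for why gzip pays off on this kind of payload, here is a minimal Python sketch. The entries are a made-up stand-in for feed.json (the field names and the images.example.com URLs are assumptions, not Apple's actual schema); the point is only that repetitive JSON keys and URL prefixes compress very well.

```python
import gzip
import json

# Hypothetical stand-in for feed.json: 290 entries with repetitive keys and URLs.
entries = [
    {
        "id": i,
        "text": f"Keynote highlight number {i}",
        "lores": f"https://images.example.com/feed/{i}_lores.jpg",
        "full": f"https://images.example.com/feed/{i}_full.jpg",
    }
    for i in range(290)
]

raw = json.dumps(entries).encode("utf-8")
compressed = gzip.compress(raw)

# Fraction of bytes saved by compressing the payload.
savings = 1 - len(compressed) / len(raw)
print(f"raw: {len(raw)} bytes, gzipped: {len(compressed)} bytes, savings: {savings:.0%}")
```

The exact ratio depends on the content, so treat the printed number as illustrative rather than as a reproduction of the 531.7KB to 57KB figure above.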

Moving on. The fetched JSON file contains 290+ entries, most of which contain some text and a set of image references documenting highlights from the keynote. The site immediately dispatches all of the requests… at once. No on-demand loading, just give me all of it, please, and now.

But, 290 entries and 348 image requests? It still doesn't add up. Turns out, the site is fetching not one, but two image assets: a "lores" preview and a full replacement image. The site also appears to adapt to screen DPR, which indicates that the ~5MB total page weight reported by WPT is not the highest amount either, since we could have downloaded higher-res photos. Yes, I'm looking at you, you lucky "Retina screen" users - you got the 10MB+ experience! Those image bytes sure add up quickly.

image

To be fair, the "lores" previews are scaled down and well compressed. That said, something tells me most of them were useless since the visitor never sees the majority of them - even if you tried, you can't scroll fast enough to see all the previews. Hi, Jank. And now things take a turn for the worse: the max-age on these image assets is also <10s.

Why? No idea. All of them are served via images.apple.com, which is also fronted by Akamai, but once again, a short TTL really doesn't help with caching, which means there were a lot of requests hitting the Apple origin servers. Those poor Apache servers powering Apple's site must have been working really, really hard. I'm not surprised the site was experiencing intermittent outages.

Oh, and speaking of load on origin servers… Remember feed.json? Every 10 seconds the page makes a polling request to the server to fetch the latest version. Combine that with a really short max-age TTL and missing gzip compression, and you've just created a self-inflicted DDoS.

So, lessons learned? Well, there's a bunch:

  1. Compress your JSON. No really, gzip helps.
  2. An on-demand image loading strategy would have significantly reduced the number of requests for latecomers to the stream.
  3. Your images are likely static, so cache them with a long TTL!
  4. The "lores" previews seem to have done more harm than good.
  5. Providing a push feed instead of polling, or a "list of recent updates since X timestamp" functionality, would have significantly reduced the amount of data in feed.json.
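
The "updates since X timestamp" idea from the last lesson can be sketched in a few lines of Python. Everything here is hypothetical (the `Entry` type, the `updates_since` helper, and the integer timestamps are illustrative names, not anything Apple shipped); it only shows how a client that remembers the last timestamp it saw can fetch the delta instead of the whole feed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    timestamp: int  # hypothetical unit: seconds since the stream started
    text: str


def updates_since(feed, since):
    """Return only the entries newer than `since`, oldest first."""
    return sorted((e for e in feed if e.timestamp > since), key=lambda e: e.timestamp)


feed = [Entry(10, "Doors open"), Entry(20, "Keynote starts"), Entry(30, "New phone")]

# First poll: the client has seen nothing yet, so it gets everything.
first = updates_since(feed, since=0)
last_seen = first[-1].timestamp

# A new entry arrives; the next poll transfers only that entry.
feed.append(Entry(40, "New watch"))
second = updates_since(feed, since=last_seen)
print([e.text for e in second])  # → ['New watch']
```

With 290+ entries in the feed, each poll after the first would carry a handful of entries at most, rather than the full half-megabyte document.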

Perf fail happens to the best of us.

P.S. The actual media stream was delivered via a different origin (also via Akamai). Did the above site perf problems affect the streaming? Maybe, hard to say. Perf fail is often additive in bizarre and interesting ways.

22 Jul 11:36

Queues and databases

Dmitry Krasnoukhov

Queues are hard

Queues are an incredibly useful tool in modern computing; in web applications they are often used to perform some possibly slow computation at a later time. Basically, queues split a computation into two moments: the time the computation is scheduled, and the time the computation is executed. A "producer" puts a task to be executed into a queue, and a "consumer" or "worker" gets tasks from the queue and executes them. For example, once a new user completes the registration process in a web application, the web application adds a new task to the queue in order to send an email with the activation link. The actual process of sending the email, which may require retrying if there are transient network failures or other errors, is up to the worker.

Technically speaking, we can think of queues as a form of inter-process messaging primitive, where the receiving process needs to acknowledge the reception of the message. Messages cannot be fire-and-forget, since the queue needs to know whether a message can be removed from the queue, so some form of acknowledgement is strictly required.

When receiving a message triggers the execution of a task, as happens in the kind of queues we are talking about, the moment at which the message reception is acknowledged changes the semantics of the queue. If the worker process acknowledges the reception of the message *before* processing it and then fails, the message can be lost before the task is performed at all. If the acknowledgement is sent only *after* the message gets processed, then if the worker fails, or because of network partitions, the queue may deliver the message again. This happens whatever the queue's consistency properties are, so even if the queue is modeled using a system providing strong consistency, the indetermination still holds true:

* If messages are acknowledged before processing, the queue will have an at-most-once delivery property. This means messages can be processed zero or one time.
* If messages are acknowledged after processing, the queue will have an at-least-once delivery property. This means messages can be processed one or more times, with no upper bound.
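
The two cases above can be made concrete with a small simulation. This is only a sketch under simplifying assumptions: a single in-memory queue, one simulated crash, and a hypothetical `run` helper. A real broker would involve timeouts and the network, but the ordering of "ack" and "process" has the same effect.

```python
from collections import deque


class WorkerCrash(Exception):
    """Simulated worker failure."""


def run(messages, ack_before, crash_on):
    """Process messages with one simulated crash; return how often each task ran."""
    queue = deque(messages)
    executed = {m: 0 for m in messages}
    crashed = False
    while queue:
        msg = queue[0]
        try:
            if ack_before:
                queue.popleft()  # acknowledge first: the queue forgets the message
                if msg == crash_on and not crashed:
                    crashed = True
                    raise WorkerCrash  # worker dies before doing the work
                executed[msg] += 1
            else:
                executed[msg] += 1  # do the work first
                if msg == crash_on and not crashed:
                    crashed = True
                    raise WorkerCrash  # worker dies before the ack is sent
                queue.popleft()  # acknowledge only after processing
        except WorkerCrash:
            continue  # a new worker picks up whatever is still queued
    return executed


print(run(["a", "b", "c"], ack_before=True, crash_on="b"))   # → {'a': 1, 'b': 0, 'c': 1}
print(run(["a", "b", "c"], ack_before=False, crash_on="b"))  # → {'a': 1, 'b': 2, 'c': 1}
```

Acknowledging first loses task "b" entirely (at-most-once); acknowledging last runs it twice (at-least-once).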

While neither of these cases is perfect, in the real world the second behavior is often preferred, since it is usually much simpler to cope with multiple deliveries of a message (triggering multiple executions of the task) than with a system that from time to time does not execute a given task at all. An example of an at-least-once delivery system is Amazon SQS (Simple Queue Service).

There is also a fundamental reason why at-least-once delivery systems are to be preferred, which has to do with distributed systems: the other semantics (at-most-once delivery) requires the queue to be strongly consistent. Once the message is acknowledged, no other worker must be able to acknowledge the same message, which is a strong property.

Once we move our focus to at-least-once delivery systems, we may notice that modeling the queue with a CP system is a waste, and also a disadvantage:

* We can't guarantee more than at-least-once delivery anyway.
* Our queue loses the ability to work in the minority side of a network partition.
* Because of the consistency requirements the queue needs agreement, so we are burning performance and adding latency without any good reason.

Since messages may be delivered multiple times, what we want conceptually is a commutative data structure and an eventually consistent system. Messages can be stored in a set data structure replicated across N nodes, with the merge function being the union of the sets. Acknowledgements, received from workers after they process messages, are also conceptually elements of the set, marking a given element as processed. This is a trivial example which is not particularly practical for a real-world system, but it shows how a given kind of queue is well modeled by a given set of properties of a distributed system.
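
A toy version of this set-based model might look like the following Python sketch. The `QueueReplica` class and its method names are hypothetical, and the sketch shares the impracticality noted above (the sets only ever grow); it is here to show that a union merge makes replicas converge regardless of the order in which they exchange state.

```python
class QueueReplica:
    """One node's view of the queue: grow-only sets merged by union."""

    def __init__(self):
        self.messages = set()  # message IDs ever enqueued
        self.acked = set()     # message IDs acknowledged as processed

    def enqueue(self, msg_id):
        self.messages.add(msg_id)

    def ack(self, msg_id):
        self.acked.add(msg_id)

    def merge(self, other):
        # Union is commutative, associative, and idempotent, so replicas
        # converge regardless of merge order or repetition.
        self.messages |= other.messages
        self.acked |= other.acked

    def pending(self):
        return self.messages - self.acked


a, b = QueueReplica(), QueueReplica()
a.enqueue("m1")
a.enqueue("m2")
b.enqueue("m3")
b.ack("m1")  # the ack reaches node b before b has even seen m1

a.merge(b)
b.merge(a)
print(sorted(a.pending()), sorted(b.pending()))  # → ['m2', 'm3'] ['m2', 'm3']
```

Note that node b accepts the acknowledgement for a message it has not seen yet; because acks are just set elements, this resolves itself at merge time.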

Practically speaking there are other useful things our queue may try to provide:

* Guaranteed delivery to a single worker at least for a window of time: while multiple delivery is allowed, we want to avoid it as much as possible.
* Best-effort checks to avoid re-delivering a message after a timeout if the message was already processed. Again, we can't guarantee this property, but we may try hard to avoid re-issuing a message which was actually already processed.
* Enough internal state to handle, during normal operations, messages as a FIFO, so that messages arriving first are processed first.
* Auto cleanup of the internal data structures.
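
As a rough single-node illustration of the first, third, and fourth points, here is a Python sketch of a FIFO queue with a delivery window. The `WindowedQueue` name, the `window` parameter, and the explicit `now` argument are all assumptions made for the example; this is not the Redis-based design discussed in this post, just one simple way such properties could be provided.

```python
from collections import OrderedDict


class WindowedQueue:
    """FIFO queue that hides a delivered message for `window` seconds."""

    def __init__(self, window):
        self.window = window
        self.items = OrderedDict()  # msg_id -> time at which it may be delivered again

    def put(self, msg_id):
        self.items[msg_id] = 0.0

    def get(self, now):
        # Scan in insertion order so older messages are delivered first.
        for msg_id, visible_at in self.items.items():
            if visible_at <= now:
                self.items[msg_id] = now + self.window
                return msg_id
        return None

    def ack(self, msg_id):
        # Auto cleanup: acknowledged messages leave the internal state.
        self.items.pop(msg_id, None)


q = WindowedQueue(window=30)
q.put("m1")
q.put("m2")
first = q.get(now=0)    # "m1" goes to one worker
second = q.get(now=1)   # "m1" is inside its window, so "m2" is next
q.ack("m2")
third = q.get(now=10)   # None: "m1" is still inside its window
fourth = q.get(now=31)  # window expired without an ack: "m1" is re-delivered
```

Passing `now` explicitly keeps the example deterministic; a real implementation would read a clock and would still only offer at-least-once delivery, since the worker holding "m1" may simply be slow rather than dead.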

On top of this we need to retain messages during network partitions, so that conceptually (even if practically we could use different data structures) the set of messages to deliver is the union of all the messages of all the nodes.

Unfortunately, while many Redis-based queue implementations exist, none tries to use N independent Redis nodes and the offered primitives as building blocks for a distributed system with such characteristics. Using Redis data structures and performance, together with algorithms providing certain useful guarantees, may yield a queue system which is very practical to use, easy to administer and scale, while providing excellent performance (messages / second) per node.

Because I find the topic interesting and this is an excellent use case for Redis, I'm very slowly working on a design for such a Redis-based queue system. I hope to show something in the coming weeks, time permitting.
07 Jul 16:05

10 Tricks to Appear Smart During Meetings

Dmitry Krasnoukhov

Because meetings are great alternative to work

Like everyone, appearing smart during meetings is my top priority. Sometimes this can be difficult if you start daydreaming about your next vacation, your next nap, or bacon. When this happens, it's good to have some fallback tricks to fall back on. Here are my ten favorite tricks for quickly appearing smart during meetings.

1. Draw a Venn diagram

Getting up and drawing a Venn diagram is a great way to appear smart. It doesn't matter if your Venn diagram is wildly inaccurate; in fact, the more inaccurate the better. Even before you've put that marker down, your colleagues will begin fighting about what exactly the labels should be and how big the circles should be, etc. At this point, you can slink back to your chair and go back to playing Candy Crush on your phone.

2. Translate percentage metrics into fractions

If someone says "About 25% of all users click on this button," quickly chime in with, "So about 1 in 4," and make a note of it. Everyone will nod their head in agreement, secretly impressed and envious of your quick math skills.

3. Encourage everyone to "take a step back"

There comes a point in most meetings where everyone is chiming in, except you. Opinions and data and milestones are being thrown around and you don't know your CTA from your OTA. This is a great point to go, "Guys, guys, guys, can we take a step back here?" Everyone will turn their heads toward you, amazed at your ability to silence the fray. Follow it up with a quick, "What problem are we really trying to solve?" and, boom! You've bought yourself another hour of looking smart.

4. Nod continuously while pretending to take notes

Always bring a notepad with you. Your rejection of technology will be revered. Take notes by simply writing down one word from every sentence that you hear. Nod continuously while doing so. If someone asks you if you're taking notes, quickly say that these are your own personal notes and that someone else should really be keeping a record of the meeting. Bravo compadre. You've saved your ass, and you've gotten out of doing any extra work. Or any work at all, if you're truly succeeding.

5. Repeat the last thing the engineer said, but very very slowly

Make a mental note of the engineer in the room. Remember his name. He'll be quiet throughout most of the meeting, but when his moment comes everything out of his mouth will spring from a place of unknowable brilliance. After he utters these divine words, chime in with, "Let me just repeat that," and repeat exactly what he just said, but very, very slowly. Now, his brilliance has been transferred to you. People will look back on the meeting and mistakenly attribute the intelligent statement to you.

6. Ask "Will this scale?" no matter what it is

It's important to find out if things will scale no matter what it is you're discussing. No one even really knows what that means, but it's a good catch-all question that generally applies and drives engineers nuts.

7. Pace around the room

Whenever someone gets up from the table and walks around, don't you immediately respect them? I know I do. It takes a lot of guts but once you do it, you immediately appear smart. Fold your arms. Walk around. Go to the corner and lean against the wall. Take a deep, contemplative sigh. Trust me, everyone will be shitting their pants wondering what you're thinking. If only they knew (bacon).

8. Ask the presenter to go back a slide

"Sorry, could you go back a slide?" They're the seven words no presenter wants to hear. It doesn't matter where in the presentation you shout this out, it'll immediately make you look like you're paying closer attention than everyone else is, because clearly they missed the thing that you're about to brilliantly point out. Don't have anything to point out? Just say something like, "I'm not sure what these numbers mean," and sit back. You've bought yourself almost an entire meeting of appearing smart.

9. Step out for a phone call

You're probably afraid to step out of the room because you fear people will think you aren't making the meeting a priority. Interestingly, however, if you step out of a meeting for an "important" phone call, they'll all realize just how busy and important you are. They'll say, "Wow, this meeting is important, so if he has something even more important than this, well, we better not bother him."

10. Make fun of yourself

If someone asks what you think, and you honestly didn't hear a single word anyone said for the last hour, just say, "I honestly didn't hear a single word anyone said for the last hour." People love self-deprecating humor. Say things like, "Maybe we can just use the lawyers from my divorce," or "God I wish I was dead." They'll laugh, value your honesty, consider contacting H.R., but most importantly, think you're the smartest person in the room.

03 Jul 11:55

Is coding the new literacy?

Dmitry Krasnoukhov

Might take a long time to read, but believe me it's worth it

In the winter of 2011, a handful of software engineers landed in Boston just ahead of a crippling snowstorm. They were there as part of Code for America, a program that places idealistic young coders and designers in city halls across the country for a year. They'd planned to spend it building a new website for Boston's public schools, but within days of their arrival, the city all but shut down and the coders were stuck fielding calls in the city's snow emergency center.

In such snowstorms, firefighters can waste precious minutes finding and digging out hydrants. A city employee told the CFA team that the planning department had a list of street addresses for Boston's 13,000 hydrants. "We figured, 'Surely someone on the block with a shovel would volunteer if they knew where to look,'" says Erik Michaels-Ober, one of the CFA coders. So they got out their laptops.

Screenshot from Adopt-a-Hydrant Code for America

Now, Boston has adoptahydrant.org, a simple website that lets residents "adopt" hydrants across the city. The site displays a map of little hydrant icons. Green ones have been claimed by someone willing to dig them out after a storm, red ones are still available; 500 hydrants were adopted last winter.

Maybe that doesn't seem like a lot, but consider what the city pays to keep it running: $9 a month in hosting costs. "I figured that even if it only led to a few fire hydrants being shoveled out, that could be the difference between life or death in a fire, so it was worth doing," Michaels-Ober says. And because the CFA team open-sourced the code, meaning they made it freely available for anyone to copy and modify, other cities can adapt it for practically pennies. It has been deployed in Providence, Anchorage, and Chicago. A Honolulu city employee heard about Adopt-a-Hydrant after cutbacks slashed his budget, and now Honolulu has Adopt-a-Siren, where volunteers can sign up to check for dead batteries in tsunami sirens across the city. In Oakland, it's Adopt-a-Drain.

Sounds great, right? These simple software solutions could save lives, and they were cheap and quick to build. Unfortunately, most cities will never get a CFA team, and most can't afford to keep a stable of sophisticated programmers in their employ, either. For that matter, neither can many software companies in Silicon Valley; the talent wars have gotten so bad that even brand-name tech firms have been forced to offer employees a bonus of upwards of $10,000 if they help recruit an engineer.

In fact, even as the Department of Labor predicts the nation will add 1.2 million new computer-science-related jobs by 2022, we're graduating proportionately fewer computer science majors than we did in the 1980s, and the number of students signing up for Advanced Placement computer science has flatlined.

There's a whole host of complicated reasons why, from boring curricula to a lack of qualified teachers to the fact that in most states computer science doesn't count toward graduation requirements. But should we worry? After all, anyone can learn to code after taking a few fun, interactive lessons at sites like Codecademy, as a flurry of articles in everything from TechCrunch to Slate have claimed. (Michael Bloomberg pledged to enroll at Codecademy in 2012.) Twelve million people have watched a video from Code.org in which celebrities like NBA All-Star Chris Bosh and will.i.am pledged to spend an hour learning code, a notion endorsed by President Obama, who urged the nation: "Don't just play on your phone, program it."

So you might be forgiven for thinking that learning code is a short, breezy ride to a lush startup job with a foosball table and free kombucha, especially given all the hype about billion-dollar companies launched by self-taught wunderkinds (with nary a mention of the private tutors and coding camps that helped some of them get there). The truth is, code (if what we're talking about is the chops you'd need to qualify for a programmer job) is hard, and lots of people would find those jobs tedious and boring.

But let's back up a step: What if learning to code weren't actually the most important thing? It turns out that rather than increasing the number of kids who can crank out thousands of lines of JavaScript, we first need to boost the number who understand what code can do. As the cities that have hosted Code for America teams will tell you, the greatest contribution the young programmers bring isn't the software they write. It's the way they think. It's a principle called "computational thinking," and knowing all of the Java syntax in the world won't help if you can't think of good ways to apply it.

Unfortunately, the way computer science is currently taught in high school tends to throw students into the programming deep end, reinforcing the notion that code is just for coders, not artists or doctors or librarians. But there is good news: Researchers have been experimenting with new ways of teaching computer science, with intriguing results. For one thing, they've seen that leading with computational thinking instead of code itself, and helping students imagine how being computer savvy could help them in any career, boosts the number of girls and kids of color taking, and sticking with, computer science. Upending our notions of what it means to interface with computers could help democratize the biggest engine of wealth since the Industrial Revolution.

So what is computational thinking? If you've ever improvised dinner, pat yourself on the back: You've engaged in some light CT.

There are those who open the pantry to find a dusty bag of legumes and some sad-looking onions and think, "Lentil soup!" and those who think, "Chinese takeout." A practiced home cook can mentally sketch the path from raw ingredients to a hot meal, imagining how to substitute, divide, merge, apply external processes (heat, stirring), and so on until she achieves her end. Where the rest of us see a dead end, she sees the potential for something new.

If seeing the culinary potential in raw ingredients is like computational thinking, you might think of a software algorithm as a kind of recipe: a step-by-step guide on how to take a bunch of random ingredients and start layering them together in certain quantities, for certain amounts of time, until they produce the outcome you had in mind.

Like a good algorithm, a good recipe follows some basic principles. Ingredients are listed first, so you can collect them before you start, and there's some logic in the way they are listed: olive oil before cumin because it goes in the pan first. Steps are presented in order, not a random jumble, with staggered tasks so that you're chopping veggies while waiting for water to boil. A good recipe spells out precisely what size of dice or temperature you're aiming for. It tells you to look for signs that things are working correctly at each stage: the custard should coat the back of a spoon. Opportunities for customization are marked (use twice the milk for a creamier texture), but if any ingredients are absolutely crucial, the recipe makes sure you know it. If you need to do something over and over (add four eggs, one at a time, beating after each), those tasks are boiled down to one simple instruction.

Much like cooking, computational thinking begins with a feat of imagination, the ability to envision how digitized information (ticket sales, customer addresses, the temperature in your fridge, the sequence of events to start a car engine, anything that can be sorted, counted, or tracked) could be combined and changed into something new by applying various computational techniques. From there, it's all about "decomposing" big tasks into a logical series of smaller steps, just like a recipe.

Those techniques include a lot of testing along the way to make sure things are working. The culinary principle of mise en place is akin to the computational principle of sorting: organize your data first, and you'll cut down on search time later. Abstraction is like the concept of "mother sauces" in French cooking (béchamel, tomato, hollandaise), building blocks to develop and reuse in hundreds of dishes. There's iteration: running a process over and over until you get a desired result. The principle of parallel processing makes use of all available downtime (think: making the salad while the roast is cooking). Like a good recipe, good software is really clear about what you can tweak and what you can't. It's explicit. Computers don't get nuance; they need everything spelled out for them.

Put another way: Not every cook is a David Chang, not every writer is a Jane Austen, and not every computational thinker is a Guido van Rossum, the inventor of the influential Python programming language. But just as knowing how to scramble an egg or write an email makes life easier, so too will a grasp of computational thinking. Yet the "learn to code!" camp may have set people on the uphill path of mastering C++ syntax instead of encouraging all of us to think a little more computationally.

The happy truth is, if you get the fundamentals about how computers think, and how humans can talk to them in a language the machines understand, you can imagine a project that a computer could do, and discuss it in a way that will make sense to an actual programmer. Because as programmers will tell you, the building part is often not the hardest part: It's figuring out what to build. "Unless you can think about the ways computers can solve problems, you can't even know how to ask the questions that need to be answered," says Annette Vee, a University of Pittsburgh professor who studies the spread of computer science literacy.

Indeed, some powerful computational solutions take just a few lines of code, or no code at all. Consider this lo-fi example: In 1854, a London physician named John Snow helped squelch a cholera outbreak that had killed 616 residents. Brushing aside the prevailing theory of the disease (deadly miasma), he surveyed relatives of the dead about their daily routines. A map he made connected the disease to drinking habits: tall stacks of black lines, each representing a death, grew around a water pump on Broad Street in Soho that happened to be near a leaking cesspool. His theory: The disease was in the water. Classic principles of computational thinking came into play here, including merging two datasets to reveal something new (locations of deaths plus locations of water pumps), running the same process over and over and testing the results, and pattern recognition. The pump was closed, and the outbreak subsided.

Or take Adopt-a-Hydrant. Under the hood, it isn't a terribly sophisticated piece of software. What's ingenious is simply that someone knew enough to say: Here's a database of hydrant locations, here is a universe of people willing to help, let's match them up. The computational approach is rooted in seeing the world as a series of puzzles, ones you can break down into smaller chunks and solve bit by bit through logic and deductive reasoning. That's why Jeannette Wing, a VP of research at Microsoft who popularized the term "computational thinking," says it's a shame to think CT is just for programmers. "Computational thinking involves solving problems, designing systems, and understanding human behavior," she writes in a publication of the Association for Computing Machinery. Those are handy skills for everybody, not just computer scientists.

In other words, computational thinking opens doors. For while it may seem premature to claim that today every kid needs to code, it's clear that they're increasingly surrounded by opportunities to code, opportunities that the children of the privileged are already seizing. The parents of Facebook founder Mark Zuckerberg got him a private computer tutor when he was in middle school. Last year, 13,000 people chipped in more than $600,000 via Kickstarter for their own limited-edition copy of Robot Turtles, a board game that teaches programming basics to kids as young as three. There are plenty of free, kid-oriented code-learning sites (like Scratch, a programming language for children developed at MIT), but parents and kids in places like San Francisco or Austin are more likely to know they exist.

Computer scientists have been warning for decades that understanding code will one day be as essential as reading and writing. If they're right, understanding the importance of computational thinking can't be limited to the elite, not if we want some semblance of a democratic society. Self-taught auteurs will always be part of the equation, but to produce tech-savvy citizens "at scale," to borrow an industry term, the heavy lifting will happen in public school classrooms. Increasingly, to have a good shot at a good job, you'll need to be code literate.

Upending our notions of what it means to interface with computers could help democratize the biggest engine of wealth since the Industrial Revolution.

"Code literate." Sounds nice, but what does it mean? And where does literacy end and fluency begin? The best way to think about that is to look to the history of literacy itself.

Reading and writing have become what researchers have called "interiorized" or "infrastructural," a technology baked so deeply into everyday human life that we're never surprised to encounter it. It's the main medium through which we connect, via not only books and papers, but text messages and the voting booth, medical forms and shopping sites. If a child makes it to adulthood without being able to read or write, we call that a societal failure.

Yet for thousands of years writing was the preserve of the professional scribes employed by the elite. So what moved it to the masses? In Europe at least, writes literacy researcher Vee, the tipping point was the Domesday Book, an 11th-century survey of landowners that's been called the oldest public record in England.

A page from the Domesday Book National Archives, UK

Commissioned by William the Conqueror to take stock of what his new subjects held in terms of acreage, tenants, and livestock so as to better tax them, the survey sent royal scribes fanning across the countryside to take detailed notes during in-person interviews. It was like a hands-on demo on the efficiencies of writing, and it proved contagious. Despite skepticism (writing was hard, and maybe involved black magic), other institutions started putting it to use. Landowners and vendors required patrons and clients to sign deeds and receipts, with an "X" if nothing else. Written records became admissible in court. Especially once Johannes Gutenberg invented the printing press, writing seeped into more and more aspects of life, no longer a rarefied skill restricted to a cloistered class of aloof scribes but a function of everyday society.

Fast forward to 19th-century America, and it'd be impossible to walk down a street without being bombarded with written information, from newspapers to street signs to store displays; in the homes of everyday people, personal letters and account ledgers could be found. "The technology of writing became infrastructural," Vee writes in her paper "Understanding Computer Programming As a Literacy." "Those who could not read text began to be recast as 'illiterate' and power began to shift towards those who could." Teaching children how to read became a civic and moral imperative. Literacy rates soared over the next century, fostered through religious campaigns, the nascent public school system, and the at-home labor of many mothers.

Of course, not everyone was invited in immediately: Illiteracy among women, slaves, and people of color was often outright encouraged, sometimes even legally mandated. But today, while only some consider themselves to be "writers," practically everybody reads and writes every day. It's hard to imagine there was ever widespread resistance to universal literacy.

So how does the history of computing compare? Once again, says Vee, it starts with a census. In 1880, a Census Bureau statistician, Herman Hollerith, saw that the system of collecting and sorting surveys by hand was buckling under the weight of a growing population. He devised an electric tabulating machine, and generations of these "Hollerith machines" were used by the bureau until the 1950s, when the first commercial mainframe, the UNIVAC, was developed with a government research grant. "The first successful civilian computer," it was a revolution in computing technology: Unlike the "dumb" Hollerith machine and its cousins, which ran on punch cards, vacuum tubes, and other mechanical inputs that had to be manually entered over and over again, the UNIVAC had memory. It could store instructions, in the form of programs, and remember earlier calculations for later use.1

1 The evolution of communication technologies has always been an issue of memory. For thousands of years, the oral tradition had enough storage space to house the expanse of human records and information. As communities got bigger, oral tradition started maxing out. So a new technology sprang up, one that could distill thought into a series of symbolic scratches that could be packaged up, transported, and recompiled by the user into language and thought. But while books have immensely greater RAM than a song poem, a computer offers exponentially more capacity than either of these.

Once the UNIVAC was unveiled, research institutions and the private sector began clamoring for mainframes of their own. The scribes of the computer age, the early programmers who had worked on the first large-scale computing projects for the government during the war, took jobs at places like Bell Labs, the airline industry, banks, and research universities. "The spread of the texts from the central government to the provinces is echoed in the way that the programmers who cut their teeth on major government-funded software projects then circulated out into smaller industries, disseminating their knowledge of code writing further," Vee writes. Just as England had gone from oral tradition to written record after the Domesday Book, the United States in the 1960s and '70s shifted from written to computational record.

The 1980s made computers personal, and today it's impossible not to engage in conversations powered by code, albeit code that's hidden beneath the interfaces of our devices. But therein lies a new problem: The easy interface creates confusion around what it means to be "computer literate." Interacting with an app is very different from making or tweaking or understanding one, and opportunities to do the latter remain the province of a specialized elite. In many ways, we're still in the "scribal stage" of the computer age.

But the tricky thing about literacy, Vee says, is that it begets more literacy. It happened with writing: At first, laypeople could get by signing their names with an "X." But the more people used reading and writing, the more was required of them.

We can already see code leaking into seemingly far-removed fields. Hospital specialists collect data from the heartbeat monitors of day-old infants, and run algorithms to spot babies likely to have respiratory failure. Netflix is a gigantic experiment in statistical machine learning. Legislators are being challenged to understand encryption and relational databases during hearings on the NSA.

The most exciting advances in most scientific and technical fields already involve big datasets, powerful algorithms, and people who know how to work with both. But that's increasingly true in almost any profession. English literature and computer science researchers fed Agatha Christie's oeuvre into a computer, ran a textual-analysis program, and discovered that her vocabulary shrank significantly in her final books. They drew from the work of brain researchers and put forth a new hypothesis: Christie suffered from Alzheimer's. "More and more, no matter what you're interested in, being computationally savvy will allow you to do a better job," says Jan Cuny, a leading CS researcher at the National Science Foundation (NSF).

Grace Hopper
Grace Hopper led the team that developed the UNIVAC, the first commercial computer. Smithsonian Institution

It may be hard to swallow the idea that coding could ever be an everyday activity on par with reading and writing, in part because it looks so foreign (what's with all the semicolons and carets?). But remember that it took hundreds of years to settle on the writing conventions we take for granted today: Early spellings of words ("Whan that Aprille with his shoures soote") can seem as foreign to modern readers as today's code snippets do to nonprogrammers. Compared to the thousands of years writing has had to go from notched sticks to glossy magazines, digital technology has, in 60 years, evolved exponentially faster.

Our elementary-school language arts teachers didn't drill the alphabet into our brains anticipating Facebook or WhatsApp or any of the new ways we now interact with written material. Similarly, exposing today's third-graders to a dose of code may mean that at 30 they retain enough to ask the right questions of a programmer, working in a language they've never seen on a project they could never have imagined.

One day last year, Neil Fraser, a young software engineer at Google, showed up unannounced at a primary school in the coastal Vietnamese city of Da Nang. Did the school have computer classes, he wanted to know, and could he sit in? A school official glanced at Fraser's Google business card and led him into a classroom of fifth-graders paired up at PCs while a teacher looked on. What Fraser saw on their screens came as a bit of a shock.

Fraser, who was in Da Nang visiting his girlfriend's family, works in Google's education department in Mountain View, teaching JavaScript to new recruits. His interest in computer science education often takes him to high schools around the Bay Area, where he tells students that code is fun and interesting, and learning it can open doors after graduation.

The fifth-graders in Da Nang were doing exercises in Logo, a simple program developed at MIT in the 1970s to introduce children to programming. A turtle-shaped avatar blinked on their screens and the kids fed it simple commands in Logo's language, making it move around, leaving a colored trail behind. Stars, hexagons, and ovals bloomed on the monitors.

Logo turtle
Simple commands in Logo. MIT Media Lab

Fraser, who learned Logo when the program was briefly popular in American elementary schools, recognized the exercise. It was a lesson in loops, a bedrock programming concept in which you tell the machine to do the same thing over and over again, until you get a desired result. "A quick comparison with the United States is in order," Fraser wrote later in a blog post. At Galileo Academy, San Francisco's magnet school for science and technology, he'd found juniors in a computer science class struggling with the concept of loops. The fifth-graders in Da Nang had outpaced upperclassmen at one of the Bay Area's most tech-savvy high schools.

Another visit to an 11th-grade classroom in Ho Chi Minh City revealed students coding their way through a logic puzzle embedded in a digital maze. "After returning to the US, I asked a senior engineer how he'd rank this question on a Google interview," Fraser wrote. "Without knowing the source of the question, he judged that this would be in the top third."

Early code education isn't just happening in Vietnamese schools. Estonia, the birthplace of Skype, rolled out a countrywide programming-centric curriculum for students as young as six in 2012. In September, the United Kingdom will launch a mandatory computing syllabus for all students ages 5 to 16.

Meanwhile, even as US enrollment in almost all other STEM (science, technology, engineering, and math) fields has grown over the last 20 years, computer science has actually lost students, dropping from 25 percent of high school students earning credits in computer science to only 19 percent by 2009, according to the National Center for Education Statistics.

"Our kids are competing with kids from countries that have made computer science education a No. 1 priority," says Chris Stephenson, the former head of the Computer Science Teachers Association (CSTA). Unlike countries with federally mandated curricula, in the United States computer lesson plans can vary widely between states and even between schools in the same district. "It's almost like you have to go one school at a time," Stephenson says. In fact, currently only 20 states and Washington, DC, allow computer science to count toward core graduation requirements in math or science, and not one requires students to take a computer science course to graduate. Nor do the new Common Core standards, a push to make K-12 curricula more uniform across states, include computer science requirements.

It's no surprise, then, that the AP computer science course is among the College Board's least popular offerings; last year, almost four times more students tested in geography (114,000) than computer science (31,000). And most kids don't even get to make that choice; only 17 percent of US high schools that have advanced placement courses do so in CS. It was 20 percent in 2005.

For those who do take an AP computer science class (a yearlong course in Java, which is sort of like teaching cooking by showing how to assemble a KitchenAid), it won't count toward core graduation requirements in most states. What's more, many counselors see AP CS as a potential GPA ding, and urge students to load up on known quantities like AP English or US history. "High school kids are overloaded already," says Joanna Goode, a leading researcher at the University of Oregon's education department, and making time for courses that don't count toward anything is a hard sell.

In any case, it's hard to find anyone to teach these classes. Unlike fields such as English and chemistry, there isn't a standard path for aspiring CS teachers in grad school or continuing education programs. And thanks to wildly inconsistent certification rules between states, certified CS teachers can get stuck teaching math or library sciences if they move. Meanwhile, software whizzes often find the lure of the startup salary much stronger than the call of the classroom, and anyone who tires of Silicon Valley might find that its "move fast and break things" mantra doesn't transfer neatly to pedagogy.

And while many kids have mad skills in movie editing or Photoshopping, such talents can lull parents into thinking they're learning real computing. "We teach our kids how to be consumers of technology, not creators of technology," notes the NSF's Cuny.

Or, as Cory Doctorow, an editor of the technology-focused blog Boing Boing, put it in a manifesto titled "Why I Won't Buy an iPad": "Buying an iPad for your kids isn't a means of jump-starting the realization that the world is yours to take apart and reassemble; it's a way of telling your offspring that even changing the batteries is something you have to leave to the professionals."

But school administrators know that gleaming banks of shiny new machines go a long way in impressing parents and school boards. Last summer, the Los Angeles Unified School District set aside a billion dollars to buy an iPad for all 640,000 children in the district. To pay for the program, the district dipped into school construction bonds. Still, some parents and principals told the Los Angeles Times they were thrilled about it. "It gives us the sense of hope that these kids are being looked after," said one parent.2

2 The kids did quickly learn to hack their iPads, so there's some hope for actual inventiveness.

Sure, some schools are woefully behind on the hardware equation, but according to a 2010 federal study, only 3 percent of teachers nationwide lacked daily access to a computer in their classroom, and the nationwide ratio of students to school computers was a little more than 5-to-1. As to whether kids have computers at home, that doesn't seem to make much difference in overall performance, either. A study from the National Bureau of Economic Research reviewed the grades, test scores, homework, and attendance of California 6th- to 10th-graders who were randomly given computers to use at home for the first time. A year later, the study found, nothing had changed. Test scores, grades, disciplinary actions, time spent on homework: None of it went up or down, except that the kids did log a lot more time playing games.

We're still in the "scribal stage" of the computer age, where skills are in the hands of an elite.

One sunny morning last summer, 40 Los Angeles teachers sat in a warm classroom at UCLA playing with crayons, flash cards, and Legos. They were students again for a week, at a workshop on how to teach computer science. Which meant that first they had to learn computer science.

The lesson was in binary numbers, or how to write any number using just two digits. "Computers can only talk in ones and zeros," explained the instructor, a fellow teacher who'd taken the same course. The course is funded by the National Science Foundation, and so is the experimental new blueprint it trains teachers to use, called Exploring Computer Science (ECS). "You gotta talk to them in their language."

Made sense at first, but when it came to turning the number 1,250 into binary, the class started falling apart. At one table, two female teachers politely endured a long, wrong explanation from an older male colleague. A teacher behind them mumbled, "I don't get it," pushed his flash cards away, and counted the minutes to lunchtime. A table of guys in their 30s was loudly sprinting toward an answer, and a minute later the bearded white guy at the head of their table, i.e., the one most resembling a classic programmer, shot his hand up with the answer and an explanation of how he got there: "Basically what you do is, you just turn it into an algorithm." Blank stares suggested few colleagues knew what an algorithm was: in this case, a simple, step-by-step process for turning a number into binary. (The answer, if you're curious, is 010011100010.)
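For the curious, here is one way that step-by-step process could look in Python. This is only a sketch of the common repeated-division method, not necessarily the one taught in the workshop; the function name `to_binary` and its `width` parameter are made up for illustration.

```python
def to_binary(n: int, width: int = 0) -> str:
    """Convert a non-negative integer to binary by repeatedly dividing by 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while n > 0:
        # Each remainder is the next bit, least significant first.
        n, remainder = divmod(n, 2)
        digits.append(str(remainder))
    bits = "".join(reversed(digits)) or "0"
    # Left-pad with zeros up to the requested width (no effect if already wider).
    return bits.zfill(width)


print(to_binary(1250, width=12))  # 010011100010
```

The loop is the whole algorithm: divide by two, note the remainder, repeat until nothing is left, then read the remainders backward.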

This lesson, which by the end of the day clicked for most in the class, might seem like most people's image of CS, but the course these teachers are learning to teach couldn't look more different from classic AP computer science. Much of what's taught in ECS is about the why of computer science, not just the how. There are discussions and writing assignments on everything from personal privacy in the age of Big Data to the ethics of robot labor to how data analysis could help curb problems like school bullying. Instead of rote Java learning, it offers lots of logic games and puzzles that put the focus on computing, not computers. In fact, students hardly touch a computer for the first 12 weeks.

"Our curriculum doesn't lead with programming or code," says Jane Margolis, a senior researcher at UCLA who helped design the ECS curriculum and whose book Stuck in the Shallow End: Education, Race, and Computing provides much of the theory behind the lesson plans. "There are so many stereotypes associated with coding, and often it doesn't give the broader picture of what the field is about. The research shows you want to contextualize, show how computer science is relevant to their lives." ECS lessons ask students to imagine how they'd make use of various algorithms as a chef, or a carpenter, or a teacher, how they could analyze their own snack habits to eat better, and how their city council could use data to create cleaner, safer streets.

The ECS curriculum is now offered to 2,400 students at 31 Los Angeles public high schools and a smattering of schools in other cities, notably Chicago and Washington, DC. Before writing it, Margolis and fellow researchers spent three years visiting schools across the Los Angeles area (overcrowded urban ones and plush suburban ones) to understand why few girls and students of color were taking computer science. At a tony school in West LA that the researchers dubbed "Canyon Charter High," they noticed students of color traveling long distances to get to school, meaning they couldn't stick around for techie extracurriculars or to simply hang out with like-minded students.

Equally daunting were the stereotypes. Take Janet, the sole black girl in Canyon's AP computer science class, who told the researchers she signed up for the course in part "because we [African American females] were so limited in the world, you know, and just being able to be in a class where I can represent who I am and my culture was really important to me." When she had a hard time keeping up, like most kids in the class, the teacher, a former software developer who, researchers noted, tended to let a few white boys monopolize her attention, pulled Janet aside and suggested she drop the class, explaining that when it comes to computational skills, you either "have it or don't have it."

Research shows that girls tend to pull away from STEM subjects, including computer science, around middle school, while rates of boys in these classes stay steady. Fortunately, says Margolis, there's evidence that tweaking the way computer science is introduced can make a difference. A 2009 study tested various messages about computer science with college-bound teens. It found that explaining how programming skills can be used to "do good" (connect with one's community, make a difference on big social problems like pollution and health care) resonated strongly with girls. Far less successful were messages about getting a good job or being "in the driver's seat" of technological innovation, i.e., the dominant cultural narratives about why anyone would learn to code.

"For me, computer science can be used to implement social change," says Kim Merino, a self-described "social-justice-obsessed queer Latina nerd history teacher" who decided to take the ECS training a couple of years ago. Now, she teaches the class to middle and high schoolers at the UCLA Community School, an experimental new public K-12 school. "I saw this as a new frontier in the social-justice fight," she says. "I tell my students, 'I don't necessarily want to teach you how to get rich. I want to teach you to be a good citizen.'"

Merino's father was an aerospace engineer for Lockheed Martin. So you might think adapting to CS would be easy for her. Not quite. Most of the teachers she trained with were men. "Out of seven women, there were two of color. Honestly, I was so scared. But now, I take that to my classroom. At this point my class is half girls, mostly Latina and Korean, and they still come into my class all nervous and intimidated. My job is to get them past all of that, get them excited about all the things they could do in their lives with programming."

Merino has spent the last four years teaching kids of color growing up in inner cities to imagine what they could do with programming: not as a replacement for, but as part of, their dreams of growing up to be doctors or painters or social workers. But Merino's partner's gentle ribbings about how they'd ever start a family on a teacher's salary eventually became less gentle. She just took a job as director of professional development at CodeHS, an educational startup in San Francisco.

It was a little more than a century ago that literacy became universal in Western Europe and the United States. If computational skills are on the same trajectory, how much are we hurting our economy, and our democracy, by not moving faster to make them universal?

There's the talent squeeze, for one thing. Going by the number of computer science majors graduating each year, we're producing less than half of the talent needed to fill the Labor Department's job projections. Women currently make up 20 percent of the software workforce, blacks and Latinos around 5 percent each. Getting more of them in the computing pipeline is simply good business sense.

It would also create a future for computing that more accurately reflects its past. In 1843, a female mathematician named Ada Lovelace wrote the first algorithm ever intended to be executed on a machine. The term "programmer" was used during World War II to describe the women who worked on the world's first large-scale electronic computer, the ENIAC machine, which used calculus to come up with tables to improve artillery accuracy.3 In 1949, Rear Adm. Grace Hopper helped develop the UNIVAC, the first general-purpose computer, a.k.a. a mainframe, and in 1959 her work led to the development of COBOL, the first programming language written for commercial use.

Excluding huge swaths of the population also means prematurely killing off untold ideas and innovations that could make everyone's lives better. Because while the rash of meal delivery and dating apps designed by today's mostly young, male, urban programmers are no doubt useful, a broader base of talent might produce more for society than a frictionless Saturday night.4

3 Six "ENIAC girls" did most of the programming, but until recently their work was all but forgotten. Male engineers worked on ENIAC's hardware, reflecting that until the 1950s, coding was considered clerical, even though it always involved higher math and applied logic. It was recast as a masculine pursuit as projects like Grace Hopper's UNIVAC demonstrated its promise.

And there's evidence that diverse teams produce better products. A study of 200,000 IT patents found that "patents invented by mixed-gender teams are cited [by other inventors] more often than patents invented by female-only or male-only" teams. The authors suggest one possibility for this finding may be "that gender diversity leads to more innovative research and discovery." (Similarly, research papers across the sciences that are coauthored by racially diverse teams are more likely to be cited by other researchers than those of all-white teams.)

Fortunately, there's evidence that girls exposed to very basic programming concepts early in life are more likely to major in computer science in college. That's why approaches like Margolis' ECS course, steeped in research on how to get and keep girls and other underrepresented minorities in computer science class, as well as groups like Black Girls Code, which offers affordable code boot camps to school-age girls in places like Detroit and Memphis, may prove appealing to the industry at large.

4 For example, Janet Emerson Bashen was the first black woman to receive a software patent, in 2006, for an app to better process Equal Employment Opportunity claims.

"Computer science innovation is changing our entire lives, from the professional to the personal, even our free time," Margolis says. "I want a whole diversity of people sitting at the design table, bringing different sensibilities and values and experiences to this innovation. Asking, 'Is this good for this world? Not good for the world? What are the implications going to be?'"

We make kids learn about biology, literature, history, and geometry with the promise that navigating the wider world will be easier for their efforts. It'll be harder and harder not to include computing on that list. Decisions made by a narrow demographic of technocrat elites are already shaping their lives, from privacy and social currency, to career choices and how they spend their free time.

Black Girls Code
Black Girls Code has introduced more than 1,500 girls to programming. Black Girls Code

Margolis' program and others like it are a good start toward spreading computational literacy, but they need a tremendous amount of help to scale up to the point where it's not such a notable loss when a teacher like Kim Merino leaves the profession. What's needed to make that happen is for people who may never learn a lick of code themselves to help shape the tech revolution the old-fashioned way, through educational reform and funding for schools and volunteer literacy crusades. Otherwise, we're all doomed (well, most of us, anyway) to be stuck in the Dark Ages.

Illustration by Charis Tsevis. Web production by Tasneem Raja.
03 Jul 09:52

RSS streaming

by Julien
Dmitry Krasnoukhov

I <3 realtime web

In the last year, we focused a lot on storing data from the feeds inside Superfeedr. We started by storing a lot of Google Reader content, using our Riak backend. When introducing our new PubSubHubbub endpoint, we had the opportunity to add things like subscribe and retrieve and later, params like before and after.

We also introduced a jQuery plugin for Superfeedr, which made it extremely easy to add any RSS feed to any web page.

Streaming RSS

Today, we're moving forward by adding HTTP streaming support to the RSS stored in Superfeedr. In plain English, this means you can ask Superfeedr something like this:

Please, give me the last 5 items from that feed, but keep the connection open and give me any new item that's coming for as long as I'm listening.

Translated into curl, that would be something like this:

curl "http://stream.superfeedr.com/?hub.mode=retrieve&wait=stream&hub.topic=http://push-pub.appspot.com/feed" \
  -udemo:6f74cbf1c5d30fd0c668f2ac0592204c

You're more than welcome to try that in your shell.

You'll see that the connection then hangs open. You can easily update the feed by filling out this form, and you should see the new entry appear in your shell.

You can also get all this RSS/Atom converted to JSON by adding -H'Accept: application/json'.
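Outside of the shell, the same request could be made from a small script. The following is only a sketch using Python's standard library: the function names are made up for illustration, the endpoint and query parameters are copied from the curl command above, and HTTPS plus basic authentication (what curl's -u flag sends) are assumed.

```python
import base64
import urllib.parse
import urllib.request

# Host taken from the curl example; using HTTPS here is an assumption.
STREAM_ENDPOINT = "https://stream.superfeedr.com/"


def build_retrieve_request(topic, login, token, as_json=False):
    """Build (without sending) a streaming retrieve request for one feed."""
    query = urllib.parse.urlencode({
        "hub.mode": "retrieve",
        "wait": "stream",
        "hub.topic": topic,
    })
    request = urllib.request.Request(f"{STREAM_ENDPOINT}?{query}")
    # Equivalent of curl's -ulogin:token (HTTP basic authentication).
    credentials = base64.b64encode(f"{login}:{token}".encode()).decode()
    request.add_header("Authorization", f"Basic {credentials}")
    if as_json:
        # Equivalent of -H'Accept: application/json'.
        request.add_header("Accept", "application/json")
    return request


def stream_feed(request):
    """Yield lines as they arrive; blocks for as long as the connection stays open."""
    with urllib.request.urlopen(request) as response:
        for line in response:
            yield line.decode("utf-8", errors="replace")
```

Looping over `stream_feed(build_retrieve_request("http://push-pub.appspot.com/feed", "demo", "<token>"))` and printing each line should then behave much like the curl command: the existing items first, then new ones as they are published.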

Fanout

Of course, building and maintaining an infrastructure to handle this kind of traffic and this many concurrent connections is far from trivial. In the same way that we would not write our very own database from scratch to store the content we process, it made sense to find an existing infrastructure and rely on its builders' expertise to achieve that.

We picked Fanout because they provide a completely transparent approach by allowing us to use our very own CNAMEs and proxy calls made to our API.

The first step is to set up a subdomain and point it to Fanout's servers. Fanout will proxy to our backend any call that it can't handle itself. If your request to stream.superfeedr.com includes a wait=stream param, Fanout will proxy the request to Superfeedr's main backend. We will serve the data to be returned to the client, along with a GRIP instruction. Fanout will serve the data, but keep the connection open.

Later, when the feed updates, we will notify Fanout and they will just serve the content to any existing connection, in a completely transparent way.
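To make that exchange more concrete, here is a rough sketch of the two messages involved, written as plain Python functions. It assumes GRIP's header-based hold instructions (`Grip-Hold` and `Grip-Channel`) and its JSON publish format for HTTP streams; the function names, the channel naming scheme and the content type are hypothetical and are not taken from Superfeedr's actual backend.

```python
import hashlib
import json


def channel_for(topic):
    """Hypothetical naming scheme: one channel per feed URL."""
    return "feed-" + hashlib.sha1(topic.encode("utf-8")).hexdigest()


def hold_response(topic, current_items_body):
    """Origin's first answer: the body for the client, plus hold instructions for the proxy."""
    headers = {
        "Content-Type": "application/atom+xml",  # assumed content type
        "Grip-Hold": "stream",                   # ask the proxy to keep the connection open
        "Grip-Channel": channel_for(topic),      # bind that connection to a channel
    }
    return 200, headers, current_items_body


def publish_payload(topic, new_entry_body):
    """JSON body the origin would send to the proxy's publish endpoint when the feed updates."""
    return json.dumps({
        "items": [{
            "channel": channel_for(topic),
            "formats": {"http-stream": {"content": new_entry_body}},
        }]
    })
```

The point of the design is that the origin never holds the client connection itself: it answers once, names a channel, and later pushes new content to that channel for the proxy to fan out.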

Long polling

One of the benefits of using Fanout is that they provide multiple options when building a realtime API. HTTP streaming works extremely well when used from an HTTP client, but browsers are not always great at dealing with streams. In the browser, one option is to use our wait=poll option, combined with the after parameter.

Basically, the first request will look like this:

curl -udemo:6f74cbf1c5d30fd0c668f2ac0592204c "https://stream.superfeedr.com?hub.mode=retrieve&wait=stream&hub.topic=http%3A%2F%2Fpush-pub.appspot.com%2Ffeed"

The response will come immediately with the current content of the feed. From there, you should extract the id element of the latest entry. At the time of writing this post, it is http://push-pub.appspot.com/feed/5637036128075776. We will re-use this element as the value for the after query parameter:

curl -udemo:6f74cbf1c5d30fd0c668f2ac0592204c "https://stream.superfeedr.com?format=json&hub.mode=retrieve&wait=poll&after=http%3A%2F%2Fpush-pub.appspot.com%2Ffeed%2F5637036128075776&hub.topic=http%3A%2F%2Fpush-pub.appspot.com%2Ffeed"

If one (or more) new entry has been added during the small lag between the 2 queries, it will be served right away. However, in the more likely event that nothing was served, the connection will wait for a new item to be added to the feed. This technique will guarantee that no item is ever missed, even with a single concurrent HTTP request.
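The follow-up request can be assembled programmatically. The sketch below only builds the URL; the helper name poll_url is made up for this example, and parsing the response to find the latest entry id is left out because the exact response layout is not described here.

```python
from urllib.parse import urlencode

STREAM_ENDPOINT = "https://stream.superfeedr.com/"


def poll_url(topic, after=None, fmt="json"):
    """Build a long-polling URL; `after` is the id of the last entry seen."""
    params = {
        "format": fmt,
        "hub.mode": "retrieve",
        "wait": "poll",
        "hub.topic": topic,
    }
    if after is not None:
        # Only entries newer than this id should be returned.
        params["after"] = after
    # urlencode percent-encodes the feed URL and the entry id for us.
    return STREAM_ENDPOINT + "?" + urlencode(params)


url = poll_url(
    "http://push-pub.appspot.com/feed",
    after="http://push-pub.appspot.com/feed/5637036128075776",
)
```

A client would request this URL, wait for the response, take the id of the newest entry it received, and call poll_url again with that id as after.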

29 Jun 19:10

Photo

Dmitry Krasnoukhov

Manual is going to do it now



29 Jun 14:09

BBC Maida Vale : "You Took Your Time" feat. Jonwayne (live)

by mountkimbie

We invited Jonwayne to join us on this live version of "You Took Your Time" we did for BBC Radio 1's Residency (recorded at Maida Vale studios).
You can also head over to BBC Radio 1 to listen to the two other tracks we recorded that same day.

27 Jun 20:20

Photo



24 Jun 18:28

Introducing Lotus

A year and a half ago I felt frustrated by the state of the art of web development with Ruby. Secretly, in my spare time, I started hacking with new ideas, taking nothing for granted, destroying and starting from scratch several times, until the software was distilled into a beautiful API.

It took me a decade to get here, by following a process of subtraction of what isn't essential. Countless refinements to achieve modularity, to balance elegance with performance, and convenience with solid design.

Each alternative was weighed against real world scenarios: use cases that have been pain points or good choices in my and other developers' experience.

But this project was sitting on my computer for too long.

For this reason, at the beginning of the year, I announced the project and a slow release schedule. Each month I've released a library because I wanted to share with other developers the result of this effort, and create a discussion in the Ruby community. Now, six months and six frameworks later, I'm proud to introduce the main element: Lotus.

It's a complete web framework, with a strong emphasis on object oriented design and testability. If you use Lotus, you employ fewer DSLs and more objects, zero monkey-patching, and separation of concerns between MVC layers. Each library is designed to be small (under 500 LOC), fast and testable.

There is Lotus::Router, which is an HTTP router, and Lotus::Controller for controllers and actions. They both speak the Rack protocol, so they can be used in an existing code base, or combined together for a small API endpoint, or, again together, in a full stack Lotus app.

Lotus::View is the first library for Ruby that marks a separation between view objects and templates. Lotus::Model, with its repositories, data mapper and adapters, helps to keep domain specific logic away from persistence.

We have infinite combinations. Small components have enormous advantages in terms of reusability.

The power of these frameworks is combined together in Lotus applications.

Microservices are at the core. Several independent applications can live together in the same Ruby process.

Lotus has a smart mechanism of framework duplication, where all the libraries can be employed several times. As the code base grows, it can easily be split into smaller deliverables.

Lotus has extensive documentation that covers all the supported architectures.

The future

Lotus is still a young framework that needs to reach a certain degree of code maturity. Alongside the vision that I have for future features, it will be improved by collecting feedback from real world applications.

Also, starting from today, I'll offer free consultancy hours to all the companies and individuals who are interested in Lotus.


17 Jun 07:39

Every Bad Apology Your Tech Company Needs

Dmitry Krasnoukhov

OMG this is bookmarked post

Really, really sorry to everybody who backed us at the $75 level, but it seems like a typhoon destroyed the container ship containing our products, and also I haven't seen my roommate in a few weeks and he was holding most of the money that we received. So we're going to need a few more months to get everything together.
28 May 15:02

Additions to rules

by Yordan Yordanov
Dmitry Krasnoukhov

/regex\smy\sass/i


Why use rules?


Rules are one of the features that we are particularly proud of. Many users still don't know they even exist and the things they can do with them.

They let you create automated workflows that can relieve you from the task of manually filtering what you read.
For example you can set up a rule to automatically tag articles based on keywords in them. You can then just go to your tags and quickly see what's new in your favorite tags. Also those tagged articles will stay forever in your account, so you will be able to easily find them later.
Another example: if you want to be notified by email about something you are very interested in, you can set up a rule to forward all matching articles to your email.

A rule is a three-part process:

  1. You choose the scope the rule applies to (all articles, a specific folder or a feed).
  2. You define conditions, of which either all or at least one must be met, e.g. author name or title (not) containing a specific word or phrase.
  3. You configure actions, which will be taken if the conditions are met.
Those 3 simple steps provide you with a way to take full control over your flow of articles.
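The three parts map naturally onto a small data structure. The following is only a sketch of the idea; the Article and Rule classes and all field names are invented for illustration and say nothing about how InoReader implements rules internally.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Article:
    title: str
    author: str
    folder: str
    tags: List[str] = field(default_factory=list)


@dataclass
class Rule:
    scope: Callable[[Article], bool]             # 1. where the rule applies
    conditions: List[Callable[[Article], bool]]  # 2. what must match
    actions: List[Callable[[Article], None]]     # 3. what to do on a match
    match_all: bool = True                       # all conditions, or at least one

    def apply(self, article: Article) -> bool:
        """Run the actions if the article is in scope and the conditions hold."""
        if not self.scope(article):
            return False
        combine = all if self.match_all else any
        if not combine(cond(article) for cond in self.conditions):
            return False
        for action in self.actions:
            action(article)
        return True


# Tag every article in the "Tech" folder whose title mentions Redis.
rule = Rule(
    scope=lambda a: a.folder == "Tech",
    conditions=[lambda a: "redis" in a.title.lower()],
    actions=[lambda a: a.tags.append("redis")],
)
article = Article(title="Redis Scripting with MRuby", author="Luca", folder="Tech")
matched = rule.apply(article)
```

Keeping scope, conditions and actions as separate callables means that new condition types or new actions can be added without touching the matching logic.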
We will have follow up posts about the different use cases, but now it's time to introduce some additions to rules:

Regular expressions in conditions


Until now you could set up conditions like "is", "isn't", "contains", etc. but since today's update you are given the power to use regular expressions. Learning regular expressions can be fun too, if you don't know them well. Here are some good articles to start with:
If you need a very specific condition, regular expressions will come to the rescue.
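As a small illustration of what a regular expression condition buys you over a plain "contains", the sketch below matches titles that mention MongoDB followed by a version number. It uses Python's re module; the regex dialect that InoReader accepts may differ, and title_matches is a made-up helper name.

```python
import re


def title_matches(pattern, title, ignore_case=True):
    """Return True if the pattern is found anywhere in the title."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, title, flags) is not None


# "mongo" or "mongodb", whitespace, then a version such as 2.6
VERSIONED_MONGO = r"\bmongo(db)?\s+\d+\.\d+\b"

hit = title_matches(VERSIONED_MONGO, "MongoDB 2.6: Our Biggest Release Ever")
miss = title_matches(VERSIONED_MONGO, "MongoDB foreground index builds")
```

A "contains MongoDB" condition would match both titles; the pattern only matches the release announcement.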


Send to Pocket, Evernote, Instapaper and Readability in actions


Those 4 new actions will give you the ability to automatically send matching articles to those services. All of them are great in a different way, depending on what you are planning to use them for.
There is a limit of 150 articles per day per service. This limit is there to prevent flooding of your remote accounts. It is good for you, too: what use is such an integration if we send thousands of articles to your accounts? Will you ever read them?
It is also good to remember that Evernote accounts have an upload limit (based on article size), so watch out for that. Some image intensive articles can eat through your limit very fast.

We hope that with the new additions to rules, you will have even more control over what you do and how you do it inside InoReader.

--
The Innologica team

and one happy aquatic creature ;)


23 May 16:53

Rails Is Not Dead

Dmitry Krasnoukhov

Please speak up

A few years ago, my lead developer at the time told me: "Beware of the coupling that you're introducing in your models". My answer was: "Hey, you don't know what you're talking about, this is ActiveRecord".

I was an inexperienced developer fascinated by the innovation that Rails introduced. After all, my technology was able to get rid of all that configuration, and all that boring stuff from the past.

- Patterns?
- I don't need them. I have the Holy Grail: MVC.
- Interfaces?
- Are you kidding me? This isn't Java, it's Ruby!

We all have been there. We were recovering from the consequences of the New Economy bubble burst. A new wave of technologies was rising: AJAX, Rails, and the Web 2.0. Everything was exciting, and we felt like we'd learned from past mistakes. We were creating a new way to sell software, so we needed to be agile, fast and pragmatic.

All that thinking about the architecture looked like an unnecessary aftermath of the enterprise legacy.

Fast forwarding to today, after years spent on several projects, countless hours staring at a monitor, helping startups, banks, broadcasting and highway companies, non-profit organizations, large and small Open Source projects, I can confess one thing: I was wrong.

Rails has the undeniable credit of having changed the way we do web development. It has lowered the entry level so much that even people without a background as programmers have been able to create a successful business. But this had a cost: technical debt.

I don't blame Rails for this, as we shouldn't blame TDD for writing damaged code. We write it, we generate legacy while writing it (https://vimeo.com/1752667). We take merits when it's good, we're to blame when it isn't.

But honestly, I don't get one thing: why is half of the Ruby community so scared to talk about better code?

I have my own opinions about software, that now are diverging from Rails. I'm creating a web framework (http://lotusrb.org) to introduce new ideas and rethink architectures. I hope to remove the pain points that have been a problem in my direct experience as a programmer.

And you know what? I might be wrong, but don't be afraid to talk with me. Don't you recognize this is diversity too? I don't claim to be right like the leaders of this community think they are. I don't think that the ideas that I had ten years ago are still valid, like they do.

The current leaders are behaving like those old politicians who say that global warming is a scam: they get defensive because their world is criticized. They keep saying not to worry about it, that the problem doesn't exist, but it does.

Rails is not dead. Debating about things such as the hexagonal architecture isn't an assault on the framework (http://pivotallabs.com/hexagonal-rails-and-the-ludicrous-terminal-application/), but a way to evolve as a community. If they feel under attack, we have a problem.

Please speak up.

14 May 16:29

How to be an open source gardener

I do a lot of work on open source, but my most valuable contributions haven't been code. Writing a patch is the easiest part of open source. The truly hard stuff is all of the rest: bug trackers, mailing lists, documentation, and other management tasks. Here are some things I've learned along the way.

It was RailsConf 2012. I sat in on a panel discussion, and the number of issues open on rails/rails came up. There were about 800 issues at the time, and had been for a while. Inquiring minds wished to know if that number was ever going to drop, and how the community could help. It was brought up that there was an 'Issues team,' whose job would be to triage issues. I enthusiastically volunteered.

But what does 'issue triage' mean, exactly? Well, on a project as large as Rails, there are a ton of issues that are incomplete, stale, need more information... and nobody was tending to them. It's kind of like a garden: you need someone to pull weeds, and do it often and regularly.

But before we talk about how to pull the weeds, let's figure out what kind of garden we even have on our hands!

What are Issues?

The very first thing your project needs to do is to figure out what Issues are supposed to be for. Each project is different. For example, in Rails, we keep Issues strictly for bugs only. Help questions go to Stack Overflow, and new feature discussion and requests go to the rails-core mailing list. For Rust, we have issues for feature requests, meta-issues... everything. For some repositories, closing all of the issues is not feasible, and for others, you're shooting for zero. (If you don't believe that this is even possible, check out Sequel. Issues are rarely even open for more than a few days!)

My personal favorite is to follow the Rails way. Ideally, you'd be at zero defects, and you can still have a place to discuss features. But really, having some plan is a necessary first step here.

Regular tending

So how do you tackle 800 issues? The only way I knew how: read all of them. Yep. Here's what I did: I took a Saturday (and a Sunday), and I went to the list of open Issues, then control-clicked on each one in turn to open them in a new tab. Finally, I also control-clicked on page 2. Then I closed this tab. Now I had 31 open tabs: 30 issues, and the next page. I read through the whole issue, including comments. When I got to the last tab, I was ready to repeat the process: open 30 issues, open page 3, click close. Next!

See, people think working on open source is glamorous, but it's actually not. Working on open source is reading 800 issues over the course of a weekend.

Anyway, once I read all of those issues, I was significantly more informed about the kinds of problems Rails was facing. I had a whole bunch of common questions, comments, and problems.

The next step was to do it all again.

Wait, again? Why? Well, now that I had a handle on things, I could actually take on the task of triage-ing the issues. If I'd tried to do it before I had the context, I might not have seen the duplicate issues, I wouldn't know what the normal kinds of comments were on issues, I wouldn't have known some common questions that maintainers had on pull requests, and in general, things would have just been worse.

This time, when reading the issue, I went through a little algorithm to sort them out. It looked a little like this:

  1. Is this issue a feature request? If so, copy/paste an answer I wrote that pointed them to the mailing list, and click close.
  2. Is this issue a request for help? If so, copy/paste an answer I wrote that pointed them to StackOverflow, and click close.
  3. Was this issue for an older version of Rails than is currently supported? If so, copy/paste an answer I wrote that asks if anyone knows if this affects a supported version of Rails.
  4. Did this issue provide enough information to reproduce the error? If no, copy/paste an answer I wrote that asks if they can provide a reproduction.
  5. If the issue has a reproduction, and it wasn't on the latest Rails, try it against HEAD. If it still happened, leave a comment that it was still an issue.
  6. If we got to this point, this issue was pretty solid. Leave a comment that I had triaged it, and cc the maintainer of that relevant sub-system of Rails, so they could find issues that pertain to the things they work on.
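
The checklist above reads like a small decision procedure, so here is a sketch of it as code. The Issue fields and the action strings are invented for illustration; the post does not say what happens when a bug no longer reproduces on HEAD, so the sketch flags that case for a human instead of guessing.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Kind(Enum):
    BUG = "bug"
    FEATURE_REQUEST = "feature request"
    HELP_REQUEST = "help request"


@dataclass
class Issue:
    kind: Kind
    supported_version: bool          # reported against a supported Rails version?
    has_reproduction: bool           # enough information to reproduce?
    on_latest: bool = True           # was the reproduction on the latest Rails?
    reproduces_on_head: bool = True  # only meaningful when on_latest is False


def triage(issue: Issue) -> List[str]:
    """Return the actions to take for one issue, following steps 1 to 6."""
    if issue.kind is Kind.FEATURE_REQUEST:                            # step 1
        return ["comment: point to the mailing list", "close"]
    if issue.kind is Kind.HELP_REQUEST:                               # step 2
        return ["comment: point to Stack Overflow", "close"]
    if not issue.supported_version:                                   # step 3
        return ["comment: ask whether a supported version is affected"]
    if not issue.has_reproduction:                                    # step 4
        return ["comment: ask for a reproduction"]
    actions = []
    if not issue.on_latest:                                           # step 5
        if not issue.reproduces_on_head:
            # Not covered by the checklist: leave it to a human.
            return ["needs a manual decision"]
        actions.append("comment: still an issue on HEAD")
    actions.append("comment: triaged, cc the sub-system maintainer")  # step 6
    return actions


solid = triage(Issue(Kind.BUG, supported_version=True, has_reproduction=True))
```

The early returns mirror the order of the list: the cheap "wrong place" checks come first, and only issues that survive all of them reach a maintainer.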

At the same time I did this, I clicked this button on the GitHub interface:

Screenshot 2014-04-14 at 11.09.16 AM.png

And then set up a Gmail filter to filter all of the emails into their own tag, and to skip my inbox:

Screenshot 2014-04-14 at 11.11.16 AM.png

Why do this? Well, I didn't do all 800 immediately. I decided to do one page per day. This kept it a bit more manageable, rather than taking up entire days of my time. I need these emails and filters for the important second part of the process: tending to the garden regularly.

Each morning, before I go to work, I pour a cup of coffee and check my emails. I don't handle all of them before work, but I made an effort to tackle Rails' emails first. There would usually be about 20 or 25 new emails each morning, and since it was largely just one new comment, they'd be pretty fast to get through. 15 minutes later, I was back to current on all issues. At lunch, I'd do it again: ten minutes to handle the ten or so emails by lunch, and then, before I'd go to bed, I'd do it again: 15 more minutes to handle the next 20 notifications. Basically, I was spending a little under an hour each day, but by doing it every day, it never got out of hand.

Once I got through all of the issues, we were down to more like 600. A whole fourth of the issues shouldn't even have been open in the first place. Two weeks in is when the next big gain kicked in. Why two weeks? Well, two weeks is the grace period we decided before marking an issue as stale. Why two weeks? Well, that's kind of arbitrary, but two weeks feels like enough time for someone to respond if they're actively interested in getting an issue fixed. See, issues often need the help of the reporter to truly fix, as there just isn't enough information in many bug reports to be able to reproduce and fix the problem.

So, after two weeks, I did one more thing each evening: I filtered by 'least recently updated,' and checked to see if any of those issues were stale. You just go back until they say 'two weeks,' and then, if you haven't heard from the reporter, mention that it's stale and give the issue a close. This is one of the other things I had to kind of let go of when working on a real project: closing an issue isn't forever. You can always re-open the issue later if it turns out you were wrong. So when trying to get a handle on 800 open issues, I defaulted to 'guilty until proven innocent.' Terminate issues with extreme prejudice. Leaving old, inconclusive issues doesn't help anyone. If it's a real bug that matters to someone, they'll come along and help reproduce it. If not, maybe someone else will later.
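
The evening sweep can be sketched in a few lines. The two-week grace period and the "least recently updated" ordering come from the text; the OpenIssue fields are assumptions for this example, since a real tracker would supply its own data model.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple

GRACE_PERIOD = timedelta(weeks=2)


class OpenIssue(NamedTuple):
    number: int
    updated_at: datetime
    waiting_on_reporter: bool  # we asked a question and got no answer


def stale_issues(issues: List[OpenIssue], now: datetime) -> List[int]:
    """Return the numbers of issues that can be closed as stale."""
    stale = []
    # Walk from the least recently updated issue towards the newest.
    for issue in sorted(issues, key=lambda i: i.updated_at):
        if now - issue.updated_at < GRACE_PERIOD:
            break  # everything after this one is fresher, so stop here
        if issue.waiting_on_reporter:
            stale.append(issue.number)
    return stale


now = datetime(2014, 5, 14)
to_close = stale_issues(
    [
        OpenIssue(101, datetime(2014, 4, 1), waiting_on_reporter=True),
        OpenIssue(102, datetime(2014, 5, 10), waiting_on_reporter=True),
        OpenIssue(103, datetime(2014, 4, 20), waiting_on_reporter=False),
    ],
    now,
)
```

In this example only issue 101 qualifies: 103 is old but nobody is waiting on the reporter, and 102 is still inside the grace period.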

After a month or two, keeping on it, we got down to 450 or so issues. Members of the core team joked that they had to set up extra email filters from me, because they could tell exactly when I was doing triage. Slow and steady wins the race!

At this point, I knew enough about Rails to actually start writing some patches. And I happened to be familiar with basically every open bug. So it was easy to start picking some of them and try to reproduce them locally. So I'd do that, and then try to write a patch. If I couldn't, I'd at least upload my reproduction of the issue, and then leave a note on the Issue, pointing to my reproduction. That way, another team member could simply clone my repository and get to it. The only thing better than reproduction instructions are when those instructions say git clone.

But I managed to get a few patches in, and then a few more. Doing all of this janitorial work directly led the way towards attaining a commit bit on Rails. It was a long slog at first, but it just got easier the more I did it. A lot of work in open source is this way: it's really easy once you've done it a lot, but is hard for newbies. I'm not yet sure how to tackle this problem...

I've since taken this approach on basically every repository I've worked on, and it's worked really well. But it only works if you keep at it: if you don't tend your garden, you'll get weeds. I haven't had as much time for Rails over the last few months, and it's back to 800 issues again. I'm not sure if these are real issues or not, as I've stopped tending. But without someone actively paying attention, it's only a matter of time before things get unseemly. If you're looking to help out an open source project, it's not a glamorous job, but all it takes is a little bit of work, and developing a habit.

(Oh, and I should take a moment to mention Code Triage here. It's super useful, and can also help you find projects that need help.)

14 May 15:47

How I imagine Git's "rewinding head to replay your work on top of it"

Dmitry Krasnoukhov

git rebase master

10 May 23:57

jadethemerman: did he give her 2 thumbs up?



jadethemerman:

did he give her 2 thumbs up?

06 May 17:55

Atom Is Now Open Source

Today, we're excited to announce that we are open-sourcing Atom under the MIT License. We see Atom as a perfect complement to GitHub's primary mission of building better software by working together. Atom is a long-term investment, and GitHub will continue to support its development with a dedicated team going forward. But we also know that we can't achieve our vision for Atom alone. As Emacs and Vim have demonstrated over the past three decades, if you want to build a thriving, long-lasting community around a text editor, it has to be open source.

What's included?

Much of Atom's functionality is provided by packages, and every Atom package has been open source since the day we launched the beta. Today, we're open sourcing the rest of Atom, which includes the core application, Atom's package manager, and Atom's Chromium-based desktop application framework, Atom Shell.

Atom Core

Atom's core contains the parts of the application that aren't provided by packages. This includes the build system, Atom's global environment, the workspace and panes, and the text editor component. Over time, we've extracted functionality from Atom into libraries that can be used independently, and we expect that process to continue.

Atom Package Manager

Atom's package manager, apm, is a client library and command line utility that facilitates publishing and installing Atom packages. apm is currently powered by atom.io, but we plan on standardizing the backend APIs so that eventually you can host your own registry.

Atom Shell

Finally, we're just as excited to be open-sourcing Atom Shell as we are about Atom itself. Over its 2.5 years of development, Atom has been something of a hermit crab, beginning its life in a Cocoa WebView, then migrating to the Chromium Embedded Framework, and finally making its permanent home inside Atom Shell. We experimented briefly with Node-Webkit, but decided instead to hire @zcbenz to build the exact framework we were imagining.

We've taken great care to integrate Chromium and Node in a clean, maintainable way, including sponsoring the addition of multi-context support in Node. We also created brightray and libchromiumcontent, which make it easier to embed Chromium into native applications as a shared library.

Into the future!

There's still a ton to do before Atom is ready for version 1.0. In the next few months, we'll be focusing on improving performance, releasing on Linux and Windows, and stabilizing APIs. We think being open source will help us get there faster, and more importantly, source access will give you the transparency and control you've told us you expect from your tools.

We'd like to thank everyone who has participated in the Atom beta so far. Your feedback, packages, and pull requests have been invaluable. We wouldn't be building a text editor if we didn't plan on using it for the rest of our lives, and we're excited to take this critical step toward making that a reality.

04 May 20:37

Photo

Dmitry Krasnoukhov

Does your app have huge sign up button?



02 May 15:15

Photo



09 Apr 13:37

MongoDB 2.6: Our Biggest Release Ever

Dmitry Krasnoukhov

So many features!

By Eliot Horowitz, CTO and Co-founder, MongoDB

In the five years since the initial release of MongoDB, and after hundreds of thousands of deployments, we have learned a lot. The time has come to take everything we have learned and create a basis for continued innovation over the next ten years.

Today I'm pleased to announce that, with the release of MongoDB 2.6, we have achieved that goal. With comprehensive core server enhancements, a groundbreaking new automation tool, and critical enterprise features, MongoDB 2.6 is by far our biggest release ever.

You'll see the benefits in better performance and new innovations. We re-wrote the entire query execution engine to improve scalability, and took our first step in building a sophisticated query planner by introducing index intersection. We've made the codebase easier to maintain, and made it easier to implement new features. Finally, MongoDB 2.6 lays the foundation for massive improvements to concurrency in MongoDB 2.8, including document-level locking.

From the very beginning, MongoDB has offered developers a simple and elegant way to manage their data. Now weā€™re bringing that same simplicity and elegance to managing MongoDB. MongoDB Management Service (MMS), which already provides 35,000 MongoDB customers with monitoring and alerting, now provides backup and point-in-time restore functionality, in the cloud and on-premises.

We are also announcing a game-changing feature coming later this year: automation, also with hosted and on-premises options. Automation will allow you to provision and manage MongoDB replica sets and sharded clusters via a simple yet sophisticated interface.

MongoDB 2.6 brings security, integration and analytics enhancements to ease deployment in enterprise environments. LDAP, x.509 and Kerberos authentication are critical enhancements for organizations that require a single authentication mechanism across their entire infrastructure. To enhance security, MongoDB 2.6 implements TLS encryption, user-defined roles, auditing and field-level redaction, a critical building block for trusted systems. IBM Guardium also now offers integration with MongoDB, providing more extensive auditing abilities.

These are only a few of the key improvements; read the full official release notes for more details.

MongoDB 2.6 was a major endeavor and bringing it to fruition required hard work and coordination across a rapidly growing team. Over the past few years we have built and invested in that team, and I can proudly say we have the experience, drive and determination to deliver on this and future releases. There is much still to be done, and with MongoDB 2.6, we have a foundation for the next decade of database innovation.

06 Mar 19:40

Amazing geography of First channel (Russia)

Dmitry Krasnoukhov

Fake Control localized!

Every day brings new reasons to wonder. Today we were absolutely surprised by the unique understanding of geography shown by the "Pervyy kanal" TV channel in Russia.

140,000 refugees

In the news story "More and more Ukraine citizens arrive to the southern Russian regions" First Channel claims that

more than 140 000 people crossed the Russian border in the last two weeks. Among them are not only the residents of Ukraine's East and South, but also of central and western regions.

We are ready to believe that inhabitants of the western regions cross the border; however, it's claimed that people of Ukraine come to "Kursk, Belgorod, Rostov and Bryansk regions".

We're curious whether Poland is aware that it is a part of Russia. Border crossing point Sheginy (the sign can be seen at 00:14 on the video) is situated in Lviv region and is a checkpoint on the border with Poland.

Webcams also transmit empty roads

The project team hopes that the management of the First Channel will reprimand their employees and force them to love geography. After all, in the ticklish business of propaganda and fakes, such mistakes are a major failure.

26 Feb 17:39

Redis Scripting with MRuby

Dmitry Krasnoukhov

OMG RUBY INSIDE REDIS!

pa href=http://www.mruby.orgMRuby/a is a lightweight Ruby. It was created by Matz with the purpose of having an embeddable version of the language. Even if it just reached the version 1.0, the hype around MRuby wasnā€™t high. However, there are already projects that are targeting a href=https://github.com/matsumoto-r/ngx_mrubyNginx/a, a href=https://github.com/mattn/go-mrubyGo/a, a href=http://mobiruby.orgiOS/a, a href=https://github.com/mattn/mruby-v8V8/a, and even a href=https://github.com/kyab/mruby-arduinoArduino/a./p pThe direct competitor in this huge market is a href=http://www.lua.orgLua/a: a lightweight scripting language. Since the version 2.6.0 Redis introduced a href=http://redis.io/commands#scriptingscripting/a capabilities with Lua./p figure class=highlightprecode class=language-bash data-lang=bashspan/spanspan class=c1# redis-cli/span gt; span class=nbeval/span span class=s2quot;return 5quot;/span span class=m0/span span class=o(/spanintegerspan class=o)/span span class=m5/span/code/pre/figure pstrongToday is the 5th Redis birthday/strong, and Iā€™d like celebrate this event by embedding my favorite language./p h2 id=hello-mrubyHello, MRuby/h2 pMRuby is shipped with an interpreter (codemruby/code) to execute the code via a VM. This usage is equivalent to the well known Ruby interpreter coderuby/code. MRuby can also generate a bytecode from a script, via the codemrbc/code bin./p pWhatā€™s important for our purpose are the C bindings. Letā€™s write an emHello World/em program./p pWe need a *NIX OS, gcc and bison. 
Iā€™ve extracted the MRuby code into code~/Code/mruby/code and built it with codemake/code./p figure class=highlightprecode class=language-c data-lang=cspan/spanspan class=cp#include/span span class=cpflt;mruby.hgt;/spanspan class=cp/span span class=cp#include/span span class=cpflt;mruby/compile.hgt;/spanspan class=cp/span span class=ktint/span span class=nfmain/spanspan class=p(/spanspan class=ktvoid/spanspan class=p)/span span class=p{/span span class=nmrb_state/span span class=o*/spanspan class=nmrb/span span class=o=/span span class=nmrb_open/spanspan class=p();/span span class=ktchar/span span class=ncode/spanspan class=p[]/span span class=o=/span span class=squot;p #39;hello world!#39;quot;/spanspan class=p;/span span class=nmrb_load_string/spanspan class=p(/spanspan class=nmrb/spanspan class=p,/span span class=ncode/spanspan class=p);/span span class=kreturn/span span class=mi0/spanspan class=p;/span span class=p}/span/code/pre/figure pThe compiler needs to know where are the headers and the libs:/p figure class=highlightprecode class=language-bash data-lang=bashspan/spangcc -I/Users/luca/Code/mruby/include hello_world.c span class=se\/span /Users/luca/Code/mruby/build/host/lib/libmruby.a span class=se\/span -lm -o hello_world/code/pre/figure pThis is a really basic example, we donā€™t have any control on the context where this code is executed. 
We can parse it and wrap it into a Proc (http://www.ruby-doc.org/core-2.1.1/Proc.html).

    #include <mruby.h>
    #include <mruby/compile.h> /* parser and code generator declarations */
    #include <mruby/proc.h>

    int
    main(int argc, const char * argv[])
    {
      mrb_state *mrb = mrb_open();
      mrbc_context *cxt;
      mrb_value val;
      struct mrb_parser_state *ps;
      struct RProc *proc;

      char code[] = "1 + 1";

      /* Parse the source, then compile the parse tree into a Proc. */
      cxt = mrbc_context_new(mrb);
      ps = mrb_parse_string(mrb, code, cxt);
      proc = mrb_generate_code(mrb, ps);
      mrb_pool_close(ps->pool);

      /* Run the Proc with the top-level object as self and print the result. */
      val = mrb_run(mrb, proc, mrb_top_self(mrb));
      mrb_p(mrb, val);

      mrbc_context_free(mrb, cxt);
      mrb_close(mrb);
      return 0;
    }

Hello, Redis

First, we need to make Redis depend on the MRuby libraries. We extract the language source code under deps/mruby and then hook it into the deps/Makefile build:

    mruby: .make-prerequisites
    	@printf '%b %b\n' $(MAKECOLOR)MAKE$(ENDCOLOR) $(BINCOLOR)$@$(ENDCOLOR)
    	cd mruby && $(MAKE)

see the commit (https://github.com/jodosha/redis/commit/c94263ee9bf129c3fce5d753554e170a94e0e7c0)

During startup, Redis initializes its features.
We add our own mrScriptingInit(), where we initialize the interpreter and assign it to server.mrb.

    /* src/mruby-scripting.c */
    void mrScriptingInit(void) {
      /* One interpreter, created at startup and shared through the server struct. */
      mrb_state *mrb = mrb_open();

      server.mrb = mrb;
    }

see the commit (https://github.com/jodosha/redis/commit/61a8f4472e16edbfc0d53999e3ee3193a569d51c)

Then we can add another command, REVAL, with the same syntax as EVAL; in our case MRuby is in charge of executing it.

    /* src/redis.c */
    {"reval",mrEvalCommand,-3,"s",0,zunionInterGetKeys,0,0,0,0,0},

The mrEvalCommand function is responsible for handling that command.
It's similar to the Hello World above; the only difference is that the code is passed as an argument by the Redis client (c->argv[1]->ptr).

    /* src/mruby-scripting.c */
    void mrEvalCommand(redisClient *c) {
      mrb_state *mrb = server.mrb;
      struct mrb_parser_state *ps;
      struct RProc *proc;
      mrbc_context *cxt;
      mrb_value val;

      /* Parse and compile the script sent by the client. */
      cxt = mrbc_context_new(mrb);
      ps = mrb_parse_string(mrb, c->argv[1]->ptr, cxt);
      proc = mrb_generate_code(mrb, ps);
      mrb_pool_close(ps->pool);

      /* Run it and send the result back to the client. */
      val = mrb_run(mrb, proc, mrb_top_self(mrb));
      mrAddReply(c, mrb, val);

      mrbc_context_free(mrb, cxt);
    }

see the commit (https://github.com/jodosha/redis/commit/82d67f1d83b42f3b276ebe17443a82496df05803)

Now we can compile the server and start it.

    make && src/redis-server

From another shell, start the CLI.

    src/redis-cli
    > reval "2 + 3" 0
    "5"

This was the first part of this implementation. In a future article, I'll cover how to access Redis data within the MRuby context.

For the time being, feel free to play with my fork (https://github.com/jodosha/redis/tree/mruby-scripting).