It used to be that just starting a tech business required expensive servers. Cloud computing solved this, and has since refined the model: workloads can be sliced into smaller and smaller pieces that run only on demand.
We bought into the first of these, AWS Lambda, quite some time ago. At the time, only DynamoDB was available as a persistence layer; however, AWS API Gateway made our Lambdas look, to all intents and purposes, like a real server.
This release includes some refinement of our tagging API, a few bug fixes, and the surfacing of tags to the user interface.
Tagging
- Atomic changes of tag associations now work properly.
- Tag names and tag category names are now indexed and searchable.
- Tags are now visible and assignable via the UI.

Dataset Parsing
- Fixed a major bug in our file type detection, where our file reader would exit prematurely before properly detecting the MIME type.
This release, which includes version 0.1.78, brings the beginnings of our tagging functionality.
Tagging API
- (API Only) Tags and tag categories can now be created by admins.
- (API Only) Tags can be assigned to users’ data sets.
- Data sets may be searched and browsed by multiple tags. Multiple tags are currently combined on an “AND” basis: a data set must carry every selected tag to match.

Tagging UI
- The admin user interface now permits the creation of tags and tag categories. It does not yet permit editing those tags, or deleting either the tags or the categories.
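To make the “AND” basis concrete, here is a minimal sketch in Python. It is not our actual search code: the `search_by_tags` function, the in-memory `catalog`, and the tag names are all made up for illustration. The only point is that a data set matches when the requested tags are a subset of its own tags.

```python
from typing import Dict, Iterable, List, Set


def search_by_tags(datasets: Dict[str, Set[str]], required: Iterable[str]) -> List[str]:
    """Return the names of data sets that carry every tag in `required`."""
    wanted = set(required)
    # "AND" semantics: the wanted tags must be a subset of the data set's tags.
    return sorted(name for name, tags in datasets.items() if wanted <= tags)


# Hypothetical catalog: data set name -> assigned tags.
catalog = {
    "rainfall": {"weather", "public"},
    "traffic": {"transport", "public"},
    "storms": {"weather", "public", "historical"},
}

if __name__ == "__main__":
    print(search_by_tags(catalog, ["weather", "public"]))
```

An “OR” search would instead keep any data set whose tags intersect the requested set, which is why adding more tags narrows results here rather than widening them.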
In this Monday’s release, we’ve added very simple support for publishing a dataset to the rest of the world.
Publishing
- Users may now publish their datasets and make them available to the rest of the world.
- The dataset search is now public: everyone can search our data.
- Downloading and uploading now show an explicit popup that informs users they have to authenticate to perform these actions.

Search Index Fixes
- Our search indexes were corrected to properly filter by publishing status.
This release includes a complete replacement of our data storage layer. Rather than storing things in Cassandra, we’ve moved everything over to HBase, as it provides us with a couple of very significant benefits, the most significant of which is the ability to actually order our rows. As you will notice, both the rows and columns of your data sets are now stored and downloaded in the order in which they were uploaded. Nothing else major has changed, so we’re going to forgo the regular bullet list this time.
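For the curious, HBase keeps rows sorted by the raw bytes of their row key, so ordering comes down to choosing keys whose byte order matches the order you want. The sketch below shows one common way to do that in Python. The `row_key` helper and its layout (data set id, a zero byte, then a fixed-width big-endian row index) are assumptions for illustration, not our actual schema.

```python
import struct


def row_key(dataset_id: str, row_index: int) -> bytes:
    """Build a hypothetical row key whose byte order follows upload order."""
    # A fixed-width, big-endian unsigned integer sorts bytewise in numeric order.
    return dataset_id.encode("utf-8") + b"\x00" + struct.pack(">Q", row_index)


if __name__ == "__main__":
    indexes = [10, 2, 1, 300]
    ordered = sorted(row_key("ds1", i) for i in indexes)
    # Decode the trailing 8 bytes to show the resulting row order.
    print([struct.unpack(">Q", key[-8:])[0] for key in ordered])
    # Plain decimal strings would sort "10" before "2".
    print(sorted(str(i) for i in indexes))
```

The fixed width is what matters: without it, byte-wise comparison falls back to the text ordering shown in the second line of output.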
This release (yes, we skipped a few versions in staging) is paired with the release of our data parsing framework, and now supports the automatic detection and parsing of text files in many different data encodings. We’ve made use of the Tika framework to assist in detecting the character encoding of our text files, and can now reliably support most character sets listed here.
Upload Fixes
- Uploaded files are now analyzed for text encoding and parsed accordingly.
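As a rough illustration of what "analyzed for text encoding" means, here is a much simpler stand-in written in Python with only the standard library. It does not use Tika and is far less capable than it: it checks for a byte order mark, then tries a short list of fallback encodings. The `decode_text` name and the fallback list are our own assumptions for this sketch.

```python
import codecs
from typing import Sequence, Tuple

# UTF-32 marks are checked before UTF-16 because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def decode_text(raw: bytes, fallbacks: Sequence[str] = ("utf-8", "cp1252")) -> Tuple[str, str]:
    """Return (decoded text, codec name used) for an uploaded file's bytes."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            # These codecs consume the byte order mark while decoding.
            return raw.decode(name), name
    for name in fallbacks:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so it never fails.
    return raw.decode("latin-1"), "latin-1"


if __name__ == "__main__":
    print(decode_text("caf\u00e9".encode("cp1252")))
```

Trial decoding like this can only tell you that an encoding is possible, not that it is right, which is the gap a statistical detector is meant to close.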
With the 0.1.72 release, we are moving into a more regular release cadence. Since our office hours are Monday and Thursday evenings, we’ll be pushing releases after those hours conclude. Last night’s updates are as follows:
Fixes to Upload
- Licenses, attribution, and the shared flag are no longer required.
- Sharing data sets has been disabled; you can now only see your own data sets (don’t worry, it’s coming back).
- The dataset button is now log-in-only.
With the 0.1.71 release, we are starting to address some of our user interface bugs, as well as adding some features that should simplify interacting with social websites.
UI Updates
- New linking for the favicon. Yay logos!
- Undesired scaling tended to occur when viewing the app on mobile. While we can’t exactly control user settings in mobile browsers, we can at least encourage those browsers to let us do our own layout.
A quick bug fix release this morning!
Bug Fixes
- User admin UI role dropdown is now disabled if you are looking at yourself.
- Downloading data now passes the correct mime_type query parameter, rather than mimeType.

Curious? Go check it out!
This evening’s release comes with a small selection of UI improvements, and one big new feature!
New Feature: Download Data
It’s somewhat funny that up until now, you couldn’t download the data you uploaded. Well, this has now been corrected, and you may download any publicly shared data set in any of the formats that we support. Furthermore, you can select columns on the fly, so that you don’t download anything you don’t need.
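If you are wondering what selecting columns on the fly amounts to, the idea is a simple projection: keep only the named columns, in the order requested. The Python sketch below does this for CSV text. The `project_csv` function is purely illustrative and is not how our download endpoint is implemented.

```python
import csv
import io
from typing import List


def project_csv(source: str, columns: List[str]) -> str:
    """Return CSV text containing only `columns`, in the order given."""
    reader = csv.DictReader(io.StringIO(source))
    available = reader.fieldnames or []
    missing = [name for name in columns if name not in available]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    out = io.StringIO()
    # extrasaction="ignore" drops every column that was not requested.
    writer = csv.DictWriter(
        out, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    print(project_csv("a,b,c\n1,2,3\n4,5,6\n", ["c", "a"]))
```

Because the rows are streamed from reader to writer, the unwanted columns are never written out at all.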