As an engineer, I want to make computers deal with tedious tasks so humans can focus on creativity. I’m also a pragmatist and want to iteratively improve workflows — I find that giving someone a single button to push every day is a drastic improvement over a 5-step process that needs babysitting.
For non-engineers, the most expedient way to transform data is often to export datasets (e.g. as CSV or plain text) and load them into Excel. Excel power users can then easily do any number of data transformations. The appeal of Excel is that users can easily visualize the data, manipulate it in an iterative fashion, and get constant feedback during the process. The downside is that getting the data in and out of Excel involves a number of cumbersome manual steps, and it’s difficult to audit and repeat the steps used to transform the data. WYSIWYG tools are indeed excellent for proofs of concept, but they often need to be operationalized into a true automation pipeline.
Most engineers prefer manipulating data in code, but the desire to have easy access to data and an iterative process to transform data with constant feedback is still the same. Accessing the data often involves a tedious dance with authentication. Even for basic prototyping, caching is often necessary to speed up development and stay within rate limits. To combine, slice, or aggregate data in different ways often involves some sort of re-indexing. And throughout this process, lacking a friendly interface to visualize and play with the data, developers rely on crude methods such as println to debug and understand their data transformations.
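To make the caching point concrete, here is a minimal sketch in Python of a time-based cache wrapped around a rate-limited fetch. It is plain Python, not Transposit code; the names cached and fetch_report, and the 60-second lifetime, are assumptions made up for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


def cached(fetch: Callable[[str], Any], ttl_seconds: float = 300.0,
           clock: Callable[[], float] = time.monotonic) -> Callable[[str], Any]:
    """Wrap a fetch function so repeated calls with the same key reuse the result."""
    store: Dict[str, Tuple[float, Any]] = {}

    def wrapper(key: str) -> Any:
        now = clock()
        hit = store.get(key)
        if hit is not None and now - hit[0] < ttl_seconds:
            return hit[1]  # still fresh: no upstream call
        value = fetch(key)  # missing or stale: call the upstream source once
        store[key] = (now, value)
        return value

    return wrapper


# Hypothetical stand-in for a rate-limited API call; it records every real call.
calls = []


def fetch_report(name: str) -> dict:
    calls.append(name)
    return {"report": name, "rows": 3}


get_report = cached(fetch_report, ttl_seconds=60)
get_report("inventory")
get_report("inventory")  # served from the cache
print(len(calls))  # 1
```

While iterating on a transformation, only the first call reaches the upstream API, which keeps the feedback loop fast and leaves the rate limit untouched.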
Transposit gives you the power of relational queries across data sources anywhere, whether the data lives in SaaS behind APIs, in a database in your datacenter, or in the cloud. We provide developers easy access to their data, a platform for combining and transforming their data, and a friendly interface to visualize and debug these transformations during development and in production.
Recently we worked with a customer that had a typical ecommerce setup including product data, website analytics, and inventory information. All this information was exported as multiple CSV files, massaged in Excel, combined with the aid of a MySQL database, and ultimately exported as a JSON file to improve website search ranking. The process was time-consuming and difficult to reproduce.
We built an unobtrusive solution connecting the various data sources as follows:
Using Transposit, we had a platform to quickly access the data and iteratively manipulate it to engineer the best solution to the problem. Connecting the customer’s public product feed to Transposit enabled us to do SQL joins and filters on that data to combine it with the other sources. We fetched and cached the inventory data from Gmail and re-indexed it with Transposit’s managed Elasticsearch so we could join in aggregated inventory data. Previously, the custom Excel manipulation was needed because the product URLs in the product feed did not match the website paths in Google Analytics. However, once we had access to the data, we realized that we could build a URL resolver using Transposit’s hosted functions to resolve the URL discrepancy. This previously time-consuming and manual process now runs automatically every day, with tooling to help debug and understand problems when they occur.
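To illustrate the URL mismatch, here is a rough sketch in Python of one way such a resolver could work. It is not the hosted function we built, and it assumes the discrepancy comes down to host, letter case, query strings, and trailing slashes; the function resolve_path, the field names, and the sample rows are all hypothetical.

```python
from urllib.parse import urlsplit


def resolve_path(url: str) -> str:
    """Reduce a URL or path to a comparable key: lowercase path, no query, no trailing slash."""
    path = urlsplit(url).path.lower().rstrip("/")
    return path or "/"


# Hypothetical sample rows; the field names are illustrative only.
products = [
    {"sku": "A1", "url": "https://shop.example.com/Products/Blue-Widget/?ref=feed"},
    {"sku": "B2", "url": "https://shop.example.com/products/red-widget"},
]
pageviews = [
    {"page_path": "/products/blue-widget", "views": 120},
    {"page_path": "/products/red-widget/", "views": 45},
]

# Aggregate analytics rows by resolved path, then join them onto the product feed.
views_by_path = {}
for row in pageviews:
    key = resolve_path(row["page_path"])
    views_by_path[key] = views_by_path.get(key, 0) + row["views"]

joined = [
    {"sku": p["sku"], "views": views_by_path.get(resolve_path(p["url"]), 0)}
    for p in products
]
print(joined)
```

Once both sides resolve to the same key, the join becomes a simple lookup, and that lookup is the step the manual Excel work used to stand in for.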