Tale of the Whale

Privacy Policy Update

Our privacy policy has been updated to describe collection and use of website visitor data. As always, we do not collect any personally identifiable information about you, nor share data with third parties.

Appearance on REDtalks.live

Had a blast today appearing on REDtalks.live to share how Resurface got started, why sniffers and synthetics are not enough for modern apps & APIs, and how Resurface is moving beyond usage logging and becoming a deep monitoring solution.

Processing usage data with jq

jq is a command-line JSON processor that can convert and transform usage data in many different ways. But jq is a bit daunting if you're just getting started, so we've provided a bunch of examples for processing our JSON format. What's really cool is you can also download data from our demo environment directly into jq with curl or wget.

jq documentation and examples

Version 1.9.0 Released

Our biggest release yet, wrapping up all our work over the last 6 months. Includes new data privacy rules for GDPR compliance, and a free demo environment and YouTube videos to help you get started.

Logging Rules

With this release, usage logging is always done in the context of a set of logging rules. These rules describe when consent has been given to collect user data, what kinds of data may be collected, and how sensitive fields must be masked. All rules are applied within a logger before any usage data is sent.
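
As a rough illustration of the idea (not the actual rule syntax or logger API), the sketch below applies a set of rules to usage data before anything is sent. The `RULES` hash, the `apply_rules` helper, the `x-consent` header, and the key names are all hypothetical, chosen only for this example.

```ruby
require 'json'

# Hypothetical rule set for illustration; not the real logging rule syntax.
RULES = {
  # Consent: only log when an (assumed) consent header is present.
  consent_given: ->(details) { details.any? { |k, v| k == 'request_header:x-consent' && v == 'yes' } },
  # Which kinds of data may be collected.
  allowed_keys: /\A(request_method|request_url|request_header:|request_param:|response_code)/,
  # Which sensitive fields must be masked.
  masked_keys: /password|ssn/i
}.freeze

# Applies rules to a list of [key, value] pairs.
# Returns nil when consent is missing, so nothing is sent at all.
def apply_rules(details, rules = RULES)
  return nil unless rules[:consent_given].call(details)

  details
    .select { |key, _| key.match?(rules[:allowed_keys]) }
    .map { |key, value| key.match?(rules[:masked_keys]) ? [key, '***'] : [key, value] }
end

details = [
  ['request_method', 'POST'],
  ['request_header:x-consent', 'yes'],
  ['request_param:password', 'hunter2'],
  ['response_body', '<html>...</html>']
]

puts JSON.generate(apply_rules(details))
# prints [["request_method","POST"],["request_header:x-consent","yes"],["request_param:password","***"]]
```

The point of the sketch is the ordering: consent is checked first, disallowed fields are dropped, and sensitive values are masked, all before any data leaves the application.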

Free Demo

Our free demo is now available to try out usage logging with pre-production apps, including use of custom data logging rules. No user registration or credit card required! Concerned about privacy and security? Check out our privacy policy and terms of service that describe data policies and protections in plain language.

YouTube Videos

Check out our new YouTube channel for tips and tutorials on usage logging, including use of our free demo environment and writing data protection rules.

API Changes

Changes to all loggers:

  • HttpLogger maintains static default rules
  • HttpLogger accepts rules on create
  • HttpLogger.format applies logging rules
  • New HttpMessage class (renamed from HttpMessageImpl and refactored)
  • New HttpRule class (for parsed rules) and HttpRules class (for rules parsing)
  • User session field accessors added to HttpRequestImpl
  • Updated README files, new API docs

Additional changes to Java logger:

  • HttpLoggerForServlets accepts rules in configuration

Additional changes to Node.js logger:

  • HttpLoggerForExpress accepts rules in create options, exposes static build method

NOTE: 1.x releases are alpha quality and should not be used in production.

Version 1.8.0 Released

A major rethink of how loggers provide request data, plus official Node.js support and a simplified API.

Logging Request Parameters

Up until this release, loggers have attempted to provide both the raw response body and the raw request body. Getting the response body is relatively easy and low risk. Getting the request body is much trickier without disrupting normal handling of the request.

A better approach, implemented in this release, is to prioritize logging request parameters already parsed by the application itself. This minimizes impact to regular request handling, while improving performance. Although logging request parameters will usually be preferred, loggers can still provide raw body content in cases where this is more convenient.
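
A minimal sketch of this priority, assuming a hypothetical `request_details` helper and made-up key names (the real loggers may structure this differently): parsed parameters are used when the application has them, and the raw body is only a fallback.

```ruby
# Hypothetical helper: builds loggable [key, value] pairs for a request.
# Prefers parameters already parsed by the application; falls back to
# the raw body only when no parameters are available.
def request_details(params, raw_body = nil)
  details = []
  if params && !params.empty?
    params.each { |name, value| details << ["request_param:#{name}", value.to_s] }
  elsif raw_body && !raw_body.empty?
    # Empty bodies are skipped entirely rather than logged as blanks.
    details << ['request_body', raw_body]
  end
  details
end

p request_details({ 'user' => 'ahab', 'page' => 2 }, 'user=ahab&page=2')
p request_details({}, '{"user":"ahab"}')
```

Because the parameters have already been parsed by the framework, the logger never has to read (and possibly consume) the request stream itself.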

Logging from Node.js and Express

Our Node.js logger is finally finished and ready for prime time! Express applications are supported, either by adding a logger to specific routes, or by configuring middleware to log usage across all routes. All logging is done asynchronously with promises for best performance. Release builds are available on npm.

API Changes

Changes to all loggers:

  • HttpLogger.log() parameters reordered (request body is last)
  • HttpLogger.format() parameters reordered (request body is last)
  • HttpLogger.isStringContentType() helper method added
  • Filtering out empty request and response bodies

Additional changes to Java logger:

  • BaseServletRequestImpl added, used by HttpServletRequestImpl
  • BaseServletResponseImpl added, used by HttpServletResponseImpl
  • Additional null/sanity checks (to prevent null pointer exceptions)

Additional changes to Node.js logger:

  • Additional null/sanity checks (for both undefined and null)

NOTE: 1.x releases are alpha quality and should not be used in production.

Version 1.7.10 Released

Progress on Node.js logger, updates to JSON format, and installation from public repos.

Logging from Node.js Apps

Our new Node.js logger is at alpha quality, so not quite finished, but getting very close. It can be used for simple use-cases with Express already. All known gaps are logged as GitHub issues, and we'll finish this up incrementally now that the heavy lifting is done.

Installing from Public Repos

Our loggers are now available from community repositories for Java, Node.js and Ruby. Earlier releases had users linking to libraries on GitHub, which is harder to configure, especially when needing a specific version of the library. We'd always planned to push our open-source libraries to these community repos, and it was time to bite the bullet and get it done.

Promoting Private URLs

We tweaked both code and documentation towards the idea that every user should have a unique logging URL that is private to them. Loggers no longer allow specifying 'DEMO' as their URL, but instead expect a proper http/https URL to always be provided during construction of the logger.
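
The check might look something like the sketch below. The class name `UrlCheckedLogger` and its methods are hypothetical and only illustrate the behavior described here: a logger that receives anything other than a proper http/https URL stays disabled.

```ruby
require 'uri'

# Hypothetical logger sketch; not the actual logger class.
class UrlCheckedLogger
  attr_reader :url

  def initialize(url)
    @url = url
    @enableable = valid_url?(url) # decided once, at construction
    @enabled = @enableable
  end

  def enabled?
    @enabled
  end

  # Enabling is a no-op when the URL was never valid.
  def enable
    @enabled = true if @enableable
    self
  end

  def disable
    @enabled = false
    self
  end

  private

  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
  def valid_url?(url)
    uri = URI.parse(url.to_s)
    uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
  rescue URI::InvalidURIError
    false
  end
end

puts UrlCheckedLogger.new('https://example.com/my-private-url').enabled? # true
puts UrlCheckedLogger.new('DEMO').enabled?                               # false
```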

Simplified JSON Schema

Our improved JSON format is a simple list of key/value pairs, rather than a map of structured objects. This makes it easier to parse and validate and hand off data to native functions (like native JSON formatting).
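
To make the difference concrete, here is a small sketch using Ruby's native JSON class. The key names and the nested "before" shape are assumptions for illustration, not the exact schema.

```ruby
require 'json'

# Assumed "before" shape: a map of structured objects.
nested = {
  'request'  => { 'method' => 'GET', 'url' => 'http://localhost/index.html' },
  'response' => { 'code' => '200' }
}

# "After" shape: a simple list of key/value pairs (key names are illustrative).
flat = [
  ['request_method', 'GET'],
  ['request_url', 'http://localhost/index.html'],
  ['response_code', '200']
]

# Validation reduces to one check: an array of two-element string arrays.
def valid_message?(parsed)
  parsed.is_a?(Array) &&
    parsed.all? { |pair| pair.is_a?(Array) && pair.size == 2 && pair.all? { |s| s.is_a?(String) } }
end

json = JSON.generate(flat)       # native formatting, no custom builder needed
puts json
puts valid_message?(JSON.parse(json))                   # true
puts valid_message?(JSON.parse(JSON.generate(nested)))  # false
```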

API Changes

Changes to all loggers:

  • HttpLogger.appendToBuffer() removed
  • HttpLogger.format() signature change

Changes to Java logger:

  • JsonMessage class renamed to Json, minor signature changes
  • Json.stringify() method added

Changes to Ruby logger:

  • Removed JsonMessage in favor of native JSON class

NOTE: 1.x releases are alpha quality and should not be used in production.

Working in Multiple Languages, All the Time

So many programming blog posts pit one language against another, often with a story of how language X was replaced with language Y at company Z with amazing success. But a project like ours, where the same basic design is implemented using multiple languages, gives perspective on what language is "the best".

Modern Languages are Converging

When you compare Java 8, ECMAScript 6, and Ruby 2, there are many more common traits between them than in older versions of these languages. All of these are continuing to evolve, and for the better. For example, all these languages have full Unicode support without requiring any special effort by the programmer. All have standard collections like lists and hash maps. Certainly there are concepts that are associated with just one language -- like JavaScript promises -- but mostly what you find in one modern language is present in the rest in some form.

All Unit Tests Should Look Alike

If you're going to port code across multiple languages, keeping your test cases in sync is a great way to keep sane. We hoped for our project to have a set of unit tests that are native to each language, but easy to compare side-by-side. We weren't sure how to approach this at first, but found all our unit tests look basically the same with this combination of libraries:

Every Language has Rough Edges

Unfortunately every language has some aspect that is truly odious. Java programmers have to suffer through Maven (and Maven Central) to accomplish publishing a native package to a community repository, which literally takes a few minutes with Ruby or Node. Java is also subject to the whims of Oracle. JavaScript lacks true private variables for its classes, requires ugly checks against 'undefined', and has 'truthiness', which seems deliberately punishing. Ruby's default threading model is weak and its general performance can be horrible, although it is otherwise mostly lovely. No language lacks for minor disappointment and frustration if you use it long enough.

Coming Soon - Logging From Node.js

With Java and Ruby loggers going strong, it makes sense to attempt Node.js support next. I don't mind admitting that I'm not a Node.js expert, but that doesn't deter me from trying this out in the open just like everything else thus far.

resurfaceio/logger-nodejs on GitHub

Repost - Greatest Public Datasets for AI

This blog post has links to a bunch of available datasets, plus some lovely quotes like these:

  • "Democratization of data is a necessary step towards accelerating AI"
  • "Most people in AI forget that the hardest part of building a new AI solution or product is not the AI or algorithms — it’s the data collection and labeling."

Ruby Demo Movie (6 min)

This short video shows configuring a Ruby logger (for applications using Rack, Rails, or Sinatra) that sends data to a private listener (our lastn app). Heroku is used as the cloud platform. Karon says I sound like I'm channeling Bob Ross, which I completely take as a compliment.

Watch Ruby demo on Dropbox

Java Demo Movie (6 min)

This short video shows configuring a Java logger (for applications using Spark framework or servlet container) that sends data to a private listener (our lastn app). Heroku is used as the cloud platform. I'm no Martin Scorsese, but I think this turned out pretty well for a first demo.

Watch Java movie on Dropbox

Version 1.6.10 Released

Sinatra support, plus a number of smaller fixes and improvements. Full API compatibility with 1.6.x.

Major changes:

  • Testing with Sinatra - Ruby server framework, on Heroku
  • Testing with international characters
  • Testing case-sensitivity for key headers
  • Allow HTTP destination URLs

For Java logger:

  • Fix for servlet filter blocking request parameters
  • Test for exceptions within servlet filter chain
  • Tests added for LoggedRequestWrapper
  • JsonMessage checks for null params

For Ruby logger:

  • Squelch logging 404s from rack middleware
  • Fix for invalid JSON when testing with Sinatra
  • Test for exceptions within rack/rails middleware
  • JsonMessage checks for nil params

NOTE: 1.x releases are alpha quality and should not be used in production.

Repost - Data scientists don't really enjoy collecting and organizing data!

This survey claims data preparation accounts for about 80% of the work of data scientists. No shock this is the least enjoyable part of the job! (Also a good discussion on reddit.)

Version 1.6 Released

Automated integration tests using Capybara, PhantomJS and Poltergeist.

From the start we’ve had good automated unit tests, but when it comes to testing with real applications, we’ve been manually following our README instructions for each logger & environment just like anybody who finds us on GitHub. This has produced good results, but as the number of supported environments has grown, this has become a headache.

With this milestone, we have a new set of integration tests that exercise all loggers. These tests are written in Ruby, and use real live environments (rather than mocks) to try out the latest snapshot build of each logger. It takes ~15 minutes and valid Heroku credentials to run all these tests. Unit tests present in each logger were unchanged.

Why standardize on Ruby for integration tests? Because Ruby is a good tool for this job. It's easy to call system commands through backticks, while wrapping those operations in exception handlers and closures. Using a headless browser that can run JavaScript (Capybara+Poltergeist+PhantomJS) makes it easy to drive test applications. And in the end these Ruby test fixtures are quite small and readable and easy to maintain.
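
For a flavor of what this looks like, here is a minimal sketch (not taken from the actual test suite) of calling system commands through backticks with exception handling and a block for cleanup. The helper names `run!` and `with_cleanup` are hypothetical, and the sketch assumes a Unix-like shell.

```ruby
# Runs a shell command via backticks and raises if it exits non-zero.
def run!(command)
  output = `#{command} 2>&1` # capture stderr along with stdout
  raise "command failed (#{$?.exitstatus}): #{command}" unless $?.success?
  output
end

# Yields to the block, then always runs the cleanup command,
# even when the block raises.
def with_cleanup(cleanup_command)
  yield
ensure
  run!(cleanup_command)
end

begin
  with_cleanup('echo cleaning up') do
    puts run!('echo deploying test app')
    run!('exit 3') # simulate a failing step
  end
rescue RuntimeError => e
  puts "caught: #{e.message}"
end
```

In a real fixture the commands would be things like deploying a test application and tearing it down afterward, with the headless browser driving the application in between.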

Major changes:

  • Integration test for Jetty - Java web server, on Ubuntu and OS X
  • Integration test for Rack - Ruby server framework, on Heroku
  • Integration test for Rails - Ruby application framework, on Heroku
  • Integration test for Spark - Java microservice framework, on Heroku
  • Integration test for Tomcat - Java web server, on Ubuntu and OS X

For all loggers:

  • Expect lastn test app to return JSON rather than HTML output
  • Checked and updated legal notices

NOTE: 1.x releases are alpha quality and should not be used in production.

Repost - The life-changing benefits of side projects

There was a truly awesome blog post by Grant Ammons recently extolling the virtues of side projects. He adeptly makes the case for having a side project for the pure fun of self-improvement. It's a post I really wish I had written myself...but I was far too busy working on my side project. :-)

The life-changing benefits of side projects

Version 1.5 Released

Send logging messages to any HTTPS URL that accepts a JSON POST, plus new environment variables to control operations.

Major changes:

  • Move from using Singleton instances (via HttpLoggerFactory) to multiple instances (controlled by UsageLoggers)
  • Send data to one or more URLs simultaneously, configuration for filters/middleware takes URL parameter
  • New environment variables to control default URL and global enabled status
  • Tracing API replaced with use of queue/list supplied to logger when instantiated
  • Loggers inspect URL and do not allow themselves to be enabled when misconfigured
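
A rough sketch of how these pieces could fit together follows. It is illustrative only: the module `SketchLoggers`, the class `QueueLogger`, and the environment variable names `SKETCH_LOGGERS_URL` and `SKETCH_LOGGERS_DISABLE` are invented for this example and are not the names used by the loggers.

```ruby
# Hypothetical sketch; not the real UsageLoggers implementation.
module SketchLoggers
  # Global enabled status, read from an assumed environment variable.
  def self.enabled?
    ENV['SKETCH_LOGGERS_DISABLE'] != 'true'
  end

  # Default URL, read from an assumed environment variable.
  def self.url_by_default
    ENV['SKETCH_LOGGERS_URL']
  end
end

class QueueLogger
  attr_reader :url, :queue

  # A queue (anything responding to <<) captures messages instead of
  # sending them, which stands in for a separate tracing API.
  def initialize(url: SketchLoggers.url_by_default, queue: nil)
    @url = url
    @queue = queue
  end

  # Never enabled when globally disabled or when misconfigured.
  def enabled?
    SketchLoggers.enabled? &&
      (!@queue.nil? || @url.to_s.start_with?('http://', 'https://'))
  end

  def submit(message)
    return false unless enabled?
    @queue << message if @queue # a real logger would POST to @url otherwise
    true
  end
end

# Multiple independent instances, each with its own destination.
queue_a = []
queue_b = []
loggers = [
  QueueLogger.new(url: 'https://a.example.com/log', queue: queue_a),
  QueueLogger.new(url: 'https://b.example.com/log', queue: queue_b)
]
loggers.each { |logger| logger.submit('{"hello":"world"}') }
puts queue_a.size + queue_b.size
```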

For all loggers:

  • HttpLoggerFactory replaced by UsageLoggers
  • BaseLogger.DEFAULT_URL replaced by UsageLoggers.url_by_default
  • UsageLoggers.url_for_demo used in place of "DEMO" literal
  • BaseLogger methods removed: active?, tracing?, tracing_history, tracing_start, tracing_stop

NOTE: 1.x releases are alpha quality and should not be used in production.

Version 1.4 Released

Logging request method and request/response headers, plus some minor refactorings.

Major changes:

  • Formatting requests and responses together, rather than as separate messages
  • All attributes are strings, including those known to be numbers
  • Request method & request/response headers are being logged

For all loggers:

  • UsageLogger renamed to BaseLogger
  • BaseLogger.agent is set through constructor
  • BaseLogger.post renamed to BaseLogger.submit
  • HttpLogger.formatEcho/logEcho removed
  • HttpLogger.formatRequest/formatResponse replaced with format()
  • HttpLogger.appendToBuffer added, used by format()
  • HttpLogger.logRequest/logResponse replaced with log()

For Java logger:

  • BaseLogger.agent() renamed to getAgent()
  • BaseLogger.url() renamed to getUrl()
  • BaseLogger.version() renamed to getVersion()

For Ruby logger:

  • logger.rb renamed to all.rb

NOTE: 1.x releases are alpha quality and should not be used in production.

Version 1.3 Released

Logging body content from POST requests.

For Java logger:

  • Added HttpLogger.formatRequest(StringBuilder json, long now, HttpServletRequest request, String body)
  • Added HttpLogger.logRequest(HttpServletRequest request, String body), plus overloaded version that sets body to null

For Ruby logger:

  • Added HttpLogger.format_request(json, now, request, body=nil)
  • Added HttpLogger.log_request(request, body=nil)

NOTE: 1.x releases are alpha quality and should not be used in production.

Version 1.2 Released

Added request/response implementation classes, for easy mocking and manual use.

New classes:

  • For Java logger: HttpServletRequestImpl, HttpServletResponseImpl
  • For Ruby logger: HttpRequestImpl, HttpResponseImpl

Other changes:

  • Refactored out UsageLogger abstract base class, which handles enabling/disabling, tracing API methods, and standard constructors for all usage loggers. UsageLogger provides a DEFAULT_URL constant to use in place of HttpLogger.URL.
  • HttpLogger constant SOURCE has been renamed to AGENT (with abstract accessor method), intended to be populated with the literal name of the logger source file (such as 'HttpLogger.java' or 'http_logger.rb').

NOTE: 1.x releases are alpha quality and should not be used in production.

Sharing all of the codes (on GitHub)

When starting this project, I had no idea where this might lead. I was unemployed, hungry to start my own venture, but without product or revenue primed and ready. I'm once again a working stiff with a day job, but this has been too much fun not to keep going. As paying the bills is no longer a concern, we'll be sharing all project source code (warts and all) on GitHub moving forward.

I've always thought that any resurface.io libraries to be bundled with a customer application should be open source. That's what I prefer for my own apps, not for philosophical reasons, but from real frustrations dealing with opaque commercial libraries. I have no problem with proprietary software, but any code I integrate with my app then becomes my responsibility to my customers. Having access to sources allows me to troubleshoot when things go wrong, and maybe to contribute an improvement or two along the way.

But we're not just sharing library sources, we're now sharing everything, including sources for this website and other project materials. It just seems wrong to hold back, when we had always planned to share big portions of our codebase anyway. This doesn't preclude having some commercial components way down the road. This does preclude licensing-driven revenue models that aren't very interesting anyway. So let's share all the code, why not?

resurface.io on GitHub

Yahoo news dataset as anti-pattern

Yahoo did something honorable last week in releasing their R10 Yahoo News Feed dataset, the largest public dataset of its kind ever released. (so went the headlines) At first I was excited that it might provide good patterns to follow. Not so much!

Again, Yahoo deserves praise for doing something admirable here...I think anyway. I can't be sure because Yahoo is restricting access to academic researchers only. I'm certainly not the only independent researcher to find this restriction lame, but that's not the sole complaint.

Beyond Yahoo's description on their blog, there apparently exists a readme file that provides more technical information, but it's buried in part 1 of this massive download. Having to download a big tarball just to access a tiny readme file is uncool. If there is a detailed schema of the data, that would be worth seeing without too much trouble.

If technical details aren't forthcoming, at least provide some sample data, which can be totally fake and lawsuit-free. The dataset has to be chunked up anyway for downloading. Hard to imagine that a few laptop-sized chunks couldn't be carved off safely for use by the general public. Even a few examples in text might help to see if the dataset is worth exploring.

And what about metadata? Starting with metadata about the parts of the download. Let me grab the readme, schema, sample data and metadata in a small package, and then start in on big chunks of data.

And why a big download at all? (as opposed to EBS snapshot or the like?)

Again, I don't mean to impugn the fine folks at Yahoo. But I can't help thinking that an archive half the size but twice as open and self-descriptive would have been better.

Repost - The Quartz guide to bad data

There was a really great post by the staff at Quartz last month that I missed somehow. It details 43 different ways that your dataset could need fixing. Though written from a reporter's perspective, it's a great checklist for anybody working with datasets.

'The Quartz guide to bad data' on GitHub

Our first (kinda lame) GitHub repository

I love Bootstrap. I love putting together a quick polished UI without having to fight browser quirks. So I reach for Bootstrap, Bootswatch themes, and PNotify (Bootstrap-friendly notifications) whenever possible. As I'm usually using these libraries together, why not treat them as a single unit?

I'm going to have to maintain this "distribution" whether I choose to give it a name or share it. I'll be upgrading and testing those components together as a unit either way. A single bundle would help me in building multiple sites even if it's not a huge help to anybody else. Granted this distribution is arguably lame...it doesn't introduce anything new, just bundles existing code. Should I really bother to share this?

In my mind, yes. There's no extra cost associated with sharing, especially as GitHub provides public repos for free. I might save somebody a little time. I might have ideas to tweak or customize down the road, making this distribution less generic and more interesting. In a larger sense, this is a precedent for how we intend to participate in the open-source community. Are we contributors, or just lurkers? Let's err toward contributing!

resurfaceio/bootdarkly on GitHub

Starting from scratch, yikes!

Starting a business has been my quiet dream for a while now. Not for lack of enjoyment in my work in recent years, far from it. My last company (Xaffire) was a great ride. We went from tiny startup (20 people) to part of Quest Software (4k people) to part of Dell (>100k people). I grew to love working as part of an international team, having the freedom to follow my own research, and representing my group at public events. All this conspired to keep the dream of my own business on the back burner...but that couldn't last forever.

Dell is a big company in big transition, and I was laid off (as part of a much larger layoff) in November 2015. It was a bit of a shock initially, but no hard feelings. I have many friends at Dell for whom I wish the best. If this event is what put me on track to owning my own business, I'm grateful for it.

Obviously a lot has changed in the decade since Xaffire was a startup. Monitoring/logging tools that rely on packet sniffing (like TeaLeaf, Coradiant, and Xaffire) have seen their time in the sun dimmed by surging numbers of cloud deployments. Newer user recording tools (like Clicktale, Inspectlet, and Lucky Orange) are available for customers running in cloud environments. At this point, it is a leap of faith that resurface.io can carve out a distinct niche, but obvious that doing so requires a re-think from first principles.

So I find myself starting from scratch: without any other contributors, without any inventions or intellectual property, without any intent to violate agreements with my former employer, and without any really firm ideas yet about positioning or implementation or go-to-market strategy. All I have is a domain name, an exceedingly vague tagline, and the drive to explore what comes next. This could be a stroke of genius, or utter folly...but you gotta start somewhere, right?
