linked

A CLI for Amazon Athena

I've been enjoying Amazon Athena to analyze our application event data.

In case you haven't played with it yet, Athena allows you to query data in S3 using SQL. My only complaint so far is having to use the web interface to manage schemas and run queries. Since Amazon offers a JDBC driver for Athena, I decided to build my first JRuby app - a command line interface for Athena catalogs.

You can run queries:

❯ cat queries/count-by-port.sql
SELECT COUNT(*) AS count, elb_name
FROM sampledb.elb_logs
GROUP BY elb_name
ORDER BY count DESC
LIMIT 10;

❯ athena query queries/count-by-port.sql
COUNT  | ELB_NAME
-------|-------------
151901 | elb_demo_006
151886 | elb_demo_009
151753 | elb_demo_001
151284 | elb_demo_002
151062 | elb_demo_004
150503 | elb_demo_008
149934 | elb_demo_005
149122 | elb_demo_007
148761 | elb_demo_003

manage schemas,

❯ athena table show sampledb.elb_logs
CREATE EXTERNAL TABLE `sampledb.elb_logs`(
  `request_timestamp` string COMMENT '',
  `elb_name` string COMMENT '',
  `request_ip` string COMMENT '',
  `request_port` int COMMENT '',
  `backend_ip` string COMMENT '',
  `backend_port` int COMMENT '',
  `request_processing_time` double COMMENT '',
  `backend_processing_time` double COMMENT '',
  `client_response_time` double COMMENT '',
  `elb_response_code` string COMMENT '',
  `backend_response_code` string COMMENT '',
  `received_bytes` bigint COMMENT '',
  `sent_bytes` bigint COMMENT '',
  `request_verb` string COMMENT '',
  `url` string COMMENT '',
  `protocol` string COMMENT '',
  `user_agent` string COMMENT '',
  `ssl_cipher` string COMMENT '',
  `ssl_protocol` string COMMENT '')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  'input.regex'='([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ ]*):([0-9]*) ([.0-9]*) ([.0-9]*) ([.0-9]*) (-|[0-9]*) (-|[0-9]*) ([-0-9]*) ([-0-9]*) \\\"([^ ]*) ([^ ]*) (- |[^ ]*)\\\" (\"[^\"]*\") ([A-Z0-9-]+) ([A-Za-z0-9.-]*)$')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://athena-examples-us-east-1/elb/plaintext'
TBLPROPERTIES (
  'transient_lastDdlTime'='1480278335')

list and rebuild partitions, and more.

Setup instructions and full usage are in the GitHub repository.
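
As an aside, the `input.regex` in the table definition above is what turns each raw log line into columns: every capture group, in order, fills one column. Here is a rough Python sketch of that mapping. It assumes the `\"` sequences in the DDL output are display escaping for a plain `"`. The sample log line is invented for illustration, and `parse_elb_line` is a hypothetical helper, not part of the CLI or of Athena.

```python
import re

# Column names in the order they appear in the CREATE EXTERNAL TABLE output.
COLUMNS = [
    "request_timestamp", "elb_name", "request_ip", "request_port",
    "backend_ip", "backend_port", "request_processing_time",
    "backend_processing_time", "client_response_time", "elb_response_code",
    "backend_response_code", "received_bytes", "sent_bytes", "request_verb",
    "url", "protocol", "user_agent", "ssl_cipher", "ssl_protocol",
]

# The input.regex from the DDL with the display escaping removed
# (assumption: \" in the DDL output stands for a literal double quote).
INPUT_REGEX = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ ]*):([0-9]*) '
    r'([.0-9]*) ([.0-9]*) ([.0-9]*) (-|[0-9]*) (-|[0-9]*) '
    r'([-0-9]*) ([-0-9]*) "([^ ]*) ([^ ]*) (- |[^ ]*)" '
    r'("[^"]*") ([A-Z0-9-]+) ([A-Za-z0-9.-]*)$'
)


def parse_elb_line(line):
    """Return a dict of column name -> captured string, or None if no match."""
    match = INPUT_REGEX.match(line)
    if match is None:
        return None
    # Pair the capture groups with the column names, position by position.
    return dict(zip(COLUMNS, match.groups()))


# A made-up log line in the classic ELB access log layout.
SAMPLE_LINE = (
    '2016-11-27T21:00:00.516940Z elb_demo_003 203.0.113.10:2345 '
    '172.16.0.5:80 0.000068 0.000913 0.000041 200 200 0 1024 '
    '"GET http://www.example.com:80/index.html HTTP/1.1" '
    '"curl/7.46.0" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2'
)

row = parse_elb_line(SAMPLE_LINE)
print(row["elb_name"], row["request_port"], row["url"])
```

Every value comes back as a string in this sketch. Casting to `int`, `double`, or `bigint` is left out, and the user agent keeps its surrounding quotes because the capture group includes them.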

linked

Twitter and the ephemeral rant

I rather enjoyed Garen's Twitter rant on Twitter rants:

Threaded replies and quoted retweets have transformed Twitter. It used to be mostly a carrier wave for more permanent content. Now it feels like you're always starting in the middle of the conversation. Every tweet is a subtweet.

linked

Right tool for the job

"Use the right tool for the job" is an oft-repeated aphorism in the tech world. In my experience, I have found this aphorism to be said a lot but actually implemented rarely and to also omit the reality of how technology choices are made.

I nodded along with Peter's real world translations for this overused phrase in tech. Development is about tradeoffs. There is rarely a single right tool for any software job.

linked

We’re Generation X, and we've got this.

As a middle kid from Generation X, I relate to quite a bit from David Barnett's piece in the Independent on the apparent war between Boomers and Millennials:

The boomers don’t like the millennials because they think the younger generation are feckless, whiny snowflakes who are scared of hard graft and obsessed by status, more interested in posting a selfie to social media than doing anything useful.

The millennials, on the other hand, see the boomers as a rapacious generation that’s pretty much ruined everything for them. They’re living too long, taxpayers’ money is gushing into looking after them. They’ve kept house prices high, meaning young people can’t afford to buy. Workplace pensions are rapidly becoming a thing of the past. Boomers are, by and large, Brexiteers and Trumpers. They remember when Britain was great, and think coming out of Europe will be a doddle. They want to make America great all over again.

The problem with you millennials and boomers, though you’d never admit it, is you’re too alike. You’re both insular, in different ways. You’re both selfish. You’re both so blinkered, you think you’re the only two factions in this petty little fight of yours.

I've grown tired of the back and forth. Both sides should chill. We may be in apocalyptic times, as our headlines and art want you to believe. But I wonder how much of that is self-fulfilling hand-wringing. Like the author, I identify with the best in the generations before and after mine, and I'm more optimistic for our future:

You forgot about Generation X.

But don’t fret, we’re still here. Working hard, playing hard, innovating, learning from the past and planning the future. So have your little generational war, and when you’re done, don’t worry.

We’re Generation X, and we've got this.
