Posts in Category: Technical

RubyConf 2018 is about to start, so let’s talk about RubyConf 2017!

RubyConf 2018 starts tomorrow, and just like I did with RailsConf, I’m very belatedly going to share some highlights from RubyConf 2017, which was in New Orleans last November. It was my first time attending RubyConf, and what struck me the most was the really strong sense of community. Here’s what one first-time attendee had to say:

…This conference was so incredibly worth it. I learned about sweet gems, cool projects, and job opportunities. But more importantly, I met SO MANY totally epic and amazing individuals that even after only three short days I happily now consider friends. I cannot wait to follow their coding lives and journeys in the years to come. I am confident that so many of them are going to do great and groundbreaking things. Plus, I cannot WAIT for my next RubyConf.

That’s from the post 31 thoughts I had while attending my first #RubyConf as an Opportunity Scholar. RubyConf’s Opportunity Scholar program provides financial support for folks who wouldn’t be able to attend otherwise, and are getting started with Ruby. The Scholars are then each matched with a Guide – experienced people who can help them navigate the conference, and make connections for professional development and job opportunities. I applied to be a Guide for this year’s RubyConf and I was selected – I’m looking forward to it!

RubyConf has three tracks of talks, so it’s not possible to attend them all, but here are the ones that were my favorites, including links to the videos for each of them:

  • Live Coding Music with Sonic Pi – this was a really fun talk on Sonic Pi, which Sam Aaron live-programmed while DJing the after-party that night. Here’s video of the talk and a short clip of him DJing:
  • There’s Nothing.new under the sun – this talk includes highlights from some of the best conference talks in the history of Ruby, which required a huge research effort by the presenters. It’s also a great introduction to what makes the Ruby community special. The presenters’ resource list includes links to the talks that they highlighted. Video
  • Code Reviews: Honesty, Kindness, Inspiration: Pick Three – this was my favorite talk, as doing code reviews effectively is one of the greatest challenges teams face, and this talk included a number of innovative and fantastic ideas for doing them well. Video
  • You Are Insufficiently Persuasive – Sandi Metz’ keynote – need I say more? It’s an excellent talk on working well with others: why it’s important, how to do it, and how not to do it. Video
  • High Cost Tests and High Value Tests – an excellent overview of the costs and benefits of different types of tests, and slow tests. Slides | Video
  • Deterministic Solutions to Intermittent Failures – almost all large test suites I’ve seen over the years have at least some challenges with intermittent failures (flaky tests). This talk consists of hard-won – and refreshingly specific – advice on how to address these challenges. Video
  • Git Driven refactoring – this talk showed me ways of using Git to improve your code that I’d never thought of before, and it’s also a good introduction to the SOLID principles. Slides | Video

And since the conference was in New Orleans, I now have to show you pictures from some of my time spent outside the conference…

RailsConf 2017 in tweets, and my “Why Do Planes Crash?” lightning talk

RailsConf 2018 starts in exactly one month, and I’m looking forward to it! This means I should probably get around to saying something about RailsConf 2017. The video above is cued to start at the beginning of a lightning talk I gave. The title was “Why Do Planes Crash? Lessons for Junior and Senior Developers.” Analyses of plane crashes show planes actually crash more often when the senior pilot is in the flying seat, often because junior pilots are reluctant to speak up when they see problems, while senior pilots don’t hesitate to do so when the junior pilot is flying. There are some great lessons developers can apply from this for how to do mentoring and pair programming.

The lightning talks were at the end of the 2nd day, and I made a last minute decision that morning to sign up and put a talk together. I’ve given a number of conference talks before, but never to a crowd this big, and never with so little time to prepare. Then when it was time to give the talk, there was a technical issue that prevented me from seeing my notes, so I had to wing it. Under the circumstances I think it still turned out ok. Here are my slides (they’re also embedded below) and some tweets about the talk:

I work for ActBlue and we provided Opportunity Scholarships for people who normally wouldn’t be able to attend, for financial or other reasons.

4 of us from ActBlue attended, and my co-worker Braulio gave an impressive full-length talk explaining how our technical infrastructure supports close to 8,000 active organizations, and handles peak traffic like the 2016 New Hampshire primary night, when our traffic peaked at 300,000 requests per minute and 42 credit card transactions per second.

Here are some other highlights from the conference…

Video of Marco Rogers’ talk mentioned above.

A group of us took in a Diamondbacks game the night the conference ended, and then the next morning a couple of us headed to the Desert Botanical Garden before flying home.

Lastly, here are the slides from my lightning talk:

My new job at PromptWorks, and thoughts on developer interviews

I’m excited to officially start my new job at PromptWorks next week. The slogan on their website says it all: “we are craftsmen.” If you’ve seen my Clean Code talk, you know what software craftsmanship means to me. An important aspect of it is to keep improving your skills. I’ve been working at PromptWorks on a contract basis for the past several weeks, and I can tell already that I will learn a lot from my new co-workers. They place a strong emphasis on Agile practices, quality, and working at a sustainable pace. I’ve seen enough so far to know that this isn’t just talk, and that their focus is on building long-term relationships with their clients and their staff. They’re also very involved in the local tech community. Among other things, one of them oversees the philly.rb Ruby meetup group.

They’re also supportive of me working remotely while I’m living in Fukuoka, Japan this summer and fall, which is very generous of them (especially for a new hire).

I interviewed with several different companies recently, and for me, the most dreadful part of interviewing is being asked to do live coding. This is sometimes done in the form of a pop-quiz, where I’m presented with some out of the ordinary coding problem, and I’m expected to write code on a whiteboard, or hack together a quick script to solve it. Other times it’s a surprise mini-project I’m expected to do on the spot. Even though I’ve been coding for close to 20 years, and I’ve had plenty of experience doing quality work faster than expected, I’m terrible at these coding exercises.

The issue for me is that they are nothing like doing real work. The only times in my life I’ve had to think up code on the spot for a surprise problem and write it on a whiteboard is in interviews. And in a real job I don’t think I’ve ever had a project dropped on me out of the blue and been asked to code up a solution in an hour or two, with severe consequences if I make a mistake or try to talk to anyone about it.

My thinking process is largely driven by understanding context (the context of the code, and the context of the business problem), and these coding exercises are usually devoid of context. I’ve also trained myself over the years to not just hack things together. I was told in one interview that, sorry, you won’t have time to write tests. Telling me to take my best practices and throw them out the window in an interview strikes me as completely backwards.

How to best interview programmers is a hotly debated topic. Some very respected people, like Joel Spolsky, swear by the whiteboard-coding approach. Others say you’re doing it wrong:

A candidate would come in, usually all dressed up in their best suit and tie, we’d sit down and have a talk. That talk was essentially like an oral exam in college. I would ask them to code algorithms for all the usual cute little CS problems and I’d get answers with wildly varying qualities. Some were shooting their pre-canned answers at me with unreasonable speed. They were prepared for exactly this kind of interview. Others would break under the “pressure”, barely able to continue the interview…

But how did the candidates we selected measure up? The truth is, we got very mixed results. Many of them were average, very few were excellent, and some were absolutely awful fits for their positions. So at best, the interview had no actual effect on the quality of people we were selecting, and I’m afraid that at worst, we may have skewed the scale in favor of the bad ones…

So what should a developer job interview look like then? Simple: eliminate the exam part of the interview altogether. Instead, ask a few open-ended questions that invite your candidates to elaborate about their programming work.

– What’s the last project you worked on at your former employer?
– Tell me about some of your favorite projects.
– What projects are you working on in your spare time?
– What online hacker communities do you participate in?
– Tell me about some (programming/technical) issues that you feel passionately about.

When I became Director of the web team at the Penn School of Medicine, I led an overhaul of how we conducted our interviews, and we adopted questions similar to these. We focused on behavior-description questions, which are actually much more revealing than you might think, if you haven’t tried them before. We also asked interviewees to bring in a sample of their code, and we’d have them talk us through it in the interview, and answer any questions we had about it. This was an excellent and reliable way to get an understanding of their experience level and to get past shyness and nervousness. Anyone who’s done halfway decent work becomes animated when showing off work they’re proud of.

For my interview with PromptWorks, they gave me a small project to do on my own time, to turn in a few days later, which is also a good approach. Apart from that, they also had me do a pair programming exercise, which I was worried about at first, but the focus was on getting an understanding of my thought process and overall problem-solving approach, as opposed to how fast I could tear through it, or trying to hit me with “gotcha” questions.

And they hired me, so I must have gotten something right 😉

Data IO 2013 conference – my notes

These are my notes from today’s Data IO conference

Next Generation Search with Lucene and Solr 4

Speaker’s slides

Lucene 4

  • near real time indexes (used by Twitter for 500 million new tweets/day)
  • can plug in your own scoring model
  • flexible index formats
  • much improved memory use, regexes are faster, etc
  • new autocomplete suggester

Solr (Lucene server – managed by the same team as Lucene)

  • if someone chooses the red shirt, do we have large in stock (pivot faceting – anticipating the next question)
  • improved geo-spatial (all Mexican restaurants within 5 blocks, plus function queries to rank them)
  • distributed/sharded indexing and search
  • solr as nosql data store

Uses

  • recommendation engine (LinkedIn uses Lucene for people recommendations). Recommend content to people who exhibit certain behaviors
  • avoid flight delays – one facet – flights out of airports, pivot to destination airports (O’Hare to Newark) – origin, destination, carrier, flight, delay times – look at trends over time. Solr has a stats package – you can get averages, max, min, etc
  • for local search, how to show only shops that are open? (Yelp also uses Lucene). 
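To make the flight-delay example above concrete, here is a minimal sketch of what pivot faceting and the stats summary compute, written in plain Python rather than against Solr. The records, field names, and function names are all made up for illustration; this is not Solr's API.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical flight records; the field names are assumptions for this sketch.
FLIGHTS = [
    {"origin": "ORD", "destination": "EWR", "carrier": "UA", "delay_min": 35},
    {"origin": "ORD", "destination": "EWR", "carrier": "AA", "delay_min": 5},
    {"origin": "ORD", "destination": "SFO", "carrier": "UA", "delay_min": 0},
    {"origin": "PHL", "destination": "EWR", "carrier": "UA", "delay_min": 20},
]


def pivot_facet(records, first, second):
    """Count records by `first`, then by `second` within each `first` value."""
    pivot = defaultdict(Counter)
    for record in records:
        pivot[record[first]][record[second]] += 1
    return {key: dict(counts) for key, counts in pivot.items()}


def delay_stats(records, origin, destination):
    """Min, max, and mean delay for one origin/destination pair, or None."""
    delays = [
        r["delay_min"]
        for r in records
        if r["origin"] == origin and r["destination"] == destination
    ]
    if not delays:
        return None
    return {"min": min(delays), "max": max(delays), "mean": mean(delays)}


by_route = pivot_facet(FLIGHTS, "origin", "destination")
ord_to_ewr = delay_stats(FLIGHTS, "ORD", "EWR")
```

The nested counts answer the "next question" up front: once you pick an origin, the destination counts are already there.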

You added Zookeeper to your stack, now what?

Old way of system management: active and backup servers, frantically switch to backup when active fails

Common challenges with big distributed systems

  • Outages
  • Coordination
  • Operational complexity

A common deficiency: sequential consistency (handling everything in the “right” order, when data is coming from multiple places)

  • Zookeeper is a distributed, consistent data store – strictly ordered access
  • Can keep running as long as only a minority of member nodes are lost (usually want to run 3 or 5 nodes)
  • all data stored in memory (50,000 ops/sec)
  • optimized for read performance (not write); also not optimized for giant pieces of data
  • it’s a coordination service
  • A leader node is elected by all the members. Leader manages writes (proposes it to followers, they acknowledge it, then it is assigned and written)
  • nodes can have data, and can have child nodes
  • has “ephemeral nodes” – created when a client connects, destroyed when client disconnects (these do not ever have child nodes)
  • watches: clients can be kept informed about data state changes (just lets you know it has changed, but not what it’s changed to – you need to request it again if you want to know the current value)
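Two of the points above are easy to sketch in a few lines of Python: how many lost nodes a cluster can tolerate, and how a watch only tells you that something changed. This is a toy in-memory model, not the Zookeeper client API; `ToyStore`, `max_failures`, and the path names are hypothetical, and the one-shot behavior of the watch is modeled as an assumption about classic watches.

```python
def max_failures(cluster_size):
    """Nodes that can be lost while a strict majority is still running."""
    return (cluster_size - 1) // 2


class ToyStore:
    """In-memory stand-in for a watched data store (not the Zookeeper API)."""

    def __init__(self):
        self._data = {}
        self._watchers = {}

    def get(self, path, watch=None):
        """Return the current value; optionally register a watch callback."""
        if watch is not None:
            self._watchers.setdefault(path, []).append(watch)
        return self._data.get(path)

    def set(self, path, value):
        """Store a value, then notify each registered watcher once."""
        self._data[path] = value
        # Watchers receive only the path, never the new value,
        # and are removed after firing.
        for callback in self._watchers.pop(path, []):
            callback(path)


store = ToyStore()
seen = []


def on_change(path):
    # The notification carries no data, so read the value again.
    seen.append((path, store.get(path)))


store.set("/config", "v1")
store.get("/config", watch=on_change)
store.set("/config", "v2")  # fires the watch
store.set("/config", "v3")  # no watch is registered any more
```

In this model a client that wants to keep following `/config` would pass the callback again on its re-read.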

Zookeeper is the open-source equivalent of Google’s Chubby

  • good for discovery services (like DNS)
  • Use cases: Storm and HBase, Redis – http://www.slideshare.net/ryanlecompte/handling-redis-failover-with-zookeeper
  • Distributed locking
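The distributed locking use case builds on the ephemeral nodes described earlier. Here is a rough sketch of the idea, assuming each waiting client creates a numbered entry and the lowest number holds the lock. It is a single-process toy, not Zookeeper code, and `ToyLock` and its method names are invented for illustration.

```python
import itertools


class ToyLock:
    """Single-process model of a lock queue built from numbered entries."""

    def __init__(self):
        self._sequence = itertools.count()
        self._waiting = {}  # sequence number -> client name

    def join(self, client):
        """Add a client to the queue and return its sequence number."""
        number = next(self._sequence)
        self._waiting[number] = client
        return number

    def leave(self, number):
        """Remove an entry, as happens when an ephemeral node's client disconnects."""
        self._waiting.pop(number, None)

    def holder(self):
        """The client with the lowest sequence number holds the lock."""
        if not self._waiting:
            return None
        return self._waiting[min(self._waiting)]


lock = ToyLock()
first = lock.join("worker-a")
second = lock.join("worker-b")
holder_before = lock.holder()
lock.leave(first)  # worker-a releases the lock or disconnects
holder_after = lock.holder()
```

Because the entry disappears when its client goes away, a crashed lock holder does not block everyone else forever.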

Beware – Zookeeper can be your single point of failure if you don’t have appropriate monitoring and fallbacks in place

Graph Database Use Cases

  • nodes connected by relationships
  • no tables or rows
  • nodes are property containers
  • cypher is neo4j’s query language
  • http://github.com/maxdemarzi/neo_graph_search
  • http://maxdemarzi.com/2012/10/18/matches-are-the-new-hotness
  • also used for content management, access control, insurance risk analysis, geo routing, asset management, bioinformatics
  • “what drug will bind to protein X and not interact with drug Y?”
  • http://neovisualsearch.maxdemarzi.com
  • performance factors: graph size (doesn’t matter), query degree (this is what matters – how many hops), graph density. RDBMS doesn’t scale well with data size, neo4j does
  • the more connected the data, the better it fits a graph db
  • NoSQL – 4 categories – key value, column family, document db, graph db
  • popular combo is, e.g. mongo for data, neo4j for searching it (then hydrate the search results from mongo)
  • optimized for graph traversal, not, e.g., aggregate analysis of all the nodes
  • top reasons to use it: problems with RDBMS join performance, evolving data set, domain shape is naturally a graph, open-ended business requirements
  • Gartner’s 5 graphs: interest, intent, mobile, payment
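The point above about query degree (how many hops) mattering more than graph size can be illustrated with a small breadth-first traversal. This is a plain-Python sketch over an adjacency dictionary, not neo4j or Cypher, and the `FRIENDS` data is made up.

```python
from collections import deque

# Hypothetical friendship graph as an adjacency dictionary.
FRIENDS = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob", "dan"],
    "dan": ["cat"],
}


def within_hops(graph, start, max_hops):
    """Return {node: hop count} for nodes reachable within max_hops of start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


nearby = within_hops(FRIENDS, "ann", 2)
```

The work done depends on how many nodes sit within the hop limit, not on how many nodes exist in the whole graph.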

Parquet

I didn’t take notes during this one (a drop of water from the bottom of my glass got under my Mac trackpad, and my mouse was going crazy for a while)

All the data and still not enough?

  • No matter how much data you have, it’s never enough or never seems like the right type
  • Predictive modeling – will someone default on a loan? Look at data for people who’ve had loans, and who defaulted and didn’t. Use data to make a predictive risk model
  • IID = independent and identically distributed

Example IBM sales force optimization

  • Can we predict where the opportunities are – which companies have growing IT budgets?
  • Couldn’t know what was most important – where were these target companies spending their IT budget (not disclosed)
  • Companies who are spending with us are probably representative of similar sized companies in the same market – use the “nearest neighbor” technique
  • Compared model predictions to expert salesmen’s opinions: they agreed, except for 15% of them, where the experts put the chances at zero. Why the difference? The model mis-identified some of the companies (no good way to cross-reference millions of internal customer records with independent sources)
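The “nearest neighbor” technique mentioned above can be sketched roughly as follows: estimate an unknown company's IT spend from the known customers that look most like it. The features, numbers, and function name are all hypothetical, and the talk did not describe the actual model in this detail.

```python
import math

# Hypothetical known customers: ((employees in thousands, revenue in $B), IT spend in $M)
CUSTOMERS = [
    ((1.0, 0.5), 10.0),
    ((1.2, 0.6), 14.0),
    ((9.0, 5.0), 90.0),
]


def knn_estimate(known, target, k=2):
    """Average the values of the k known entries closest to target."""
    if not known:
        raise ValueError("need at least one known entry")
    ranked = sorted(known, key=lambda entry: math.dist(entry[0], target))
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


estimate = knn_estimate(CUSTOMERS, (1.1, 0.55), k=2)
```

In practice the features would need to be scaled so that one of them does not dominate the distance.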

Siemens – computer-aided detection of breast cancer

  • patient IDs ended up predicting odds for cancer. It turns out the ID was a proxy for location (whether they were at a treatment facility or a screening facility)

Display ad auctions – how do we decide who to target?

  • multi-armed bandit – exploration vs exploitation
  • what do we know? urls you’ve visited
  • for something like advertising luxury cars, very few positive examples (people don’t buy them online)
  • There is no correlation between ad clicks and purchases
  • Better to look at – did the person eventually end up at the company’s home page at some point after seeing the ad?
  • target people who the ad can actually influence (i.e. not people who already bought the product, or never will)
  • but there’s no way to get data for that
  • Counterfactuals – you can’t both show and not show someone an ad, and observe subsequent behavior. You have to either show it or not show it
  • Ideally, build a predictive model for those who see the ad, and another model for those who don’t
  • But the industry doesn’t do that – it’s all about conversion rate
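The idea above of comparing people who see the ad with people who don't can be sketched with simple conversion rates per group. This assumes you have a randomly held-out group that was not shown the ad, which is my assumption and not something from the talk. The segment names and data are made up.

```python
def conversion_rate(outcomes):
    """Fraction of 1s in a non-empty list of 0/1 outcomes."""
    return sum(outcomes) / len(outcomes)


def uplift_by_segment(rows):
    """rows are (segment, saw_ad, converted) tuples.

    Returns {segment: rate when shown - rate when held out} for segments
    that have data in both groups.
    """
    shown, held_out = {}, {}
    for segment, saw_ad, converted in rows:
        bucket = shown if saw_ad else held_out
        bucket.setdefault(segment, []).append(1 if converted else 0)
    return {
        segment: conversion_rate(shown[segment]) - conversion_rate(held_out[segment])
        for segment in shown.keys() & held_out.keys()
    }


# Hypothetical data: one segment converts at the same rate with or without the ad.
ROWS = (
    [("already_buying", True, True)] * 3 + [("already_buying", True, False)]
    + [("already_buying", False, True)] * 3 + [("already_buying", False, False)]
    + [("persuadable", True, True)] * 2 + [("persuadable", True, False)] * 2
    + [("persuadable", False, True)] + [("persuadable", False, False)] * 3
)

uplift = uplift_by_segment(ROWS)
```

Here the "already_buying" segment has the higher raw conversion rate, but the ad changes nothing for it, which is the gap between conversion rate and influence that the talk pointed to.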

Advertising fraud

  • Malware on sites generating http requests
  • Very difficult for ad auctions systems to detect
  • Detect by looking at traffic between sites. For example, malware site womenshealthbase generates massive traffic to lots of other sites, not about women’s health
  • they make money by visiting a site with a real ad auction system. Then bid prices go up because of your traffic, which drives up ad revenue traffic on womenshealthbase
  • Auction systems now put visitors from these sites in a penalty box, until they start displaying normal behavior again

What’s new with Apache Mahout?

  • Amazon: customers who bought this item also bought this item – best known Mahout example
  • Mahout implemented most of the algorithms in the Netflix recommendation contest
  • In general, finds similarities between different groupings of things (clustering)
  • Key features: classification, clustering, collaborative filtering
  • Automatically tags questions on stack overflow
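The “customers who bought this item also bought” idea can be illustrated with plain co-occurrence counting. This is only a sketch of the general idea in Python, not Mahout's API or its actual algorithms, and the baskets are invented.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase baskets.
BASKETS = [
    ["tent", "stove", "lantern"],
    ["tent", "stove"],
    ["tent", "lantern", "stove"],
    ["stove", "kettle"],
]


def also_bought(baskets):
    """For each item, count how often every other item shares a basket with it."""
    counts = {}
    for basket in baskets:
        # sorted(set(...)) removes duplicates and keeps iteration order stable
        for item, other in permutations(sorted(set(basket)), 2):
            counts.setdefault(item, Counter())[other] += 1
    return counts


def recommend(counts, item, n=2):
    """The n items most often bought together with `item`."""
    return [other for other, _ in counts.get(item, Counter()).most_common(n)]


co_counts = also_bought(BASKETS)
suggestions = recommend(co_counts, "tent")
```

Real recommenders weight these counts (for example, to avoid recommending whatever is simply popular), but the raw co-occurrence table is the starting point.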

Uses

  • recommend friends, products, etc
  • classify content into groups
  • find similar content
  • find patterns in behavior
  • etc – general solution to machine learning problems

[I’m leaving out most of the details about performance improvements and the roadmap for upcoming refinements – below are other interesting points]

  • Often used with Hadoop, but it’s not necessary. Typically included in Hadoop distributions
  • Streaming K-means – given centroids (points at the center of a cluster) determine which clusters other points belong in
  • References: Mahout in Action (but a bit out of date), Taming Text, http://mahout.apache.org
  • Topic modeling: he wasn’t sure what the full feature set is – he’s pretty sure it doesn’t generate topic labels for you
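The assignment step described in the streaming K-means note above (given centroids, decide which cluster each point belongs in) looks roughly like this in plain Python. It covers only that one step, not the streaming algorithm or Mahout's implementation, and the points are made up.

```python
import math


def assign_to_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
        for point in points
    ]


# Hypothetical 2-D data.
CENTROIDS = [(0.0, 0.0), (10.0, 10.0)]
POINTS = [(1.0, 1.0), (9.0, 8.0), (4.0, 4.0)]

labels = assign_to_clusters(POINTS, CENTROIDS)
```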

Doug Engelbart passes away

If you use a mouse, hyperlinks, video conferencing, WYSIWYG word processor, multi-window user interface, shared documents, shared database, documents with images & text, keyword search, instant messaging, synchronous collaboration, asynchronous collaboration — thank Doug Engelbart

That quote is from one of Engelbart’s peers. It’s worth taking a few minutes to read the rest of his post, to learn about Doug Engelbart. Personal computing and the internet would not be what they are if it weren’t for his contributions.

About 14 years ago, when Maria and I worked at Stanford, we had dinner with him and his girlfriend, and another couple. He couldn’t have been more pleasant and down to earth. At the time I knew a bit about his history, but not the full extent of his contributions. And I left that dinner still not knowing – he was a modest man. Dave Crocker is someone who worked with him, and he wrote the following last night, after Engelbart’s daughter shared the news of his passing: “Besides the considerable technical contributions of Doug’s project at SRI, theirs was a group that did much to create the open and collaborative tone of the Internet that we’ve come to consider as automatic and natural, but were unusual in those days.”

The San Jose Mercury today re-published a profile of him from 1999:

But the mild-mannered computer scientist who created the computer mouse, windows-style personal computing, hyperlinking–the clickable links used in the World Wide Web–even e-mail and video conferencing, was ridiculed and shunted aside. For much of his career he was treated as a heretic by the industry titans who ultimately made billions off his inventions…

Engelbart is perhaps the most dramatic example of the valley’s habit of forgetting engineers whose brilliance helped build companies–and entire industries. CEOs fail to mention them in corporate press releases; they never become household names. Yet we use their products, or the fruits of their ideas, every day…

“We were doing this for humanity. It would never occur to us to try and cash in on it. That’s still where Doug’s mind is,” explains Rulifson, director of Sun’s Networking and Security Center…

Engelbart’s unwillingness to bend was in evidence when he met Steve Jobs for the first time in the early 1980s. It was 15 years since Engelbart had invented the computer mouse and other critical components for the personal computer, and Jobs was busy integrating them into his Macintosh.

Apple Computer Inc.’s hot-shot founder touted the Macintosh’s capabilities to Engelbart. But instead of applauding Jobs, who was delivering to the masses Engelbart’s new way to work, the father of personal computing was annoyed. In his opinion, Jobs had missed the most important piece of his vision: networking. Engelbart’s 1968 system introduced the idea of networking personal computer workstations so people could solve problems collaboratively. This was the whole point of the revolution.

“I said, ‘It [the Macintosh] is terribly limited. It has no access to anyone else’s documents, to e-mail, to common repositories of information,’” recalls Engelbart. “Steve said, ‘All the computing power you need will be on your desk top.’”

“I told him, ‘But that’s like having an exotic office without a telephone or door.’” Jobs ignored Engelbart. And Engelbart was baffled.

“We’d been using electronic mail since 1970 [over the government-backed ARPA network, predecessor to the Internet]. But both Apple and Microsoft Corp. ignored the network. You have to ask ‘Why?’” He shrugs his shoulders, a practiced gesture after 30 frustrating years…

Here is a set of highlights from his famous 1968 demo of the systems his team developed, showing early versions of computer software and hardware we now consider commonplace. In the 8th video, he shows their online, collaborative document editing system, which looks like an early version of Google Docs. In the 3rd video, he describes the empirical and evolutionary approach they took to their development process. This was another of his ideas that the industry discarded, only to finally re-discover its value, more than 30 years later, as what’s now called Agile development.

BarCamp NewsInnovation and TransparencyCamp

My presentation with Keya Dannenbaum at TransparencyCamp: "Civic engagement, local journalism, and open data"

After my WordCamp Nashville presentation, I transitioned from talking about how to write clean code, to talking about how the web is transforming the world of journalism, and what it means for civic engagement. This was the topic of the BarCamp NewsInnovation talk two weeks ago in Philadelphia given by Dave Zega and me (we work together at ElectNext). I also presented a longer, more in-depth version at TransparencyCamp in Washington, DC last week, with our CEO, Keya Dannenbaum.

Both conferences were “unconferences,” which means there’s an emphasis on discussion rather than long presentations, and the schedule is determined by the conference participants themselves, on the morning of the conference. However, both had some pre-scheduled talks, including ours.

The unconference board at #bcni13 - on the fly conference planning, with opportunities for anyone to present

The virtual unconference board at TransparencyCamp

The TransparencyCamp talk was titled “Civic engagement, local journalism, and open data.” Here’s the summary:

A fundamental purpose of journalism in the United States is to inform citizens, so that they can effectively engage in democratic self-governance. The ongoing disappearance of local newspapers in the digital era is well known, resulting in the decline of traditional watchdog journalism at the local and state levels. There are discussions of “news deserts” and unchecked malfeasance by elected officials. At the same time, we’re seeing the rise of citizen journalists, the growth of organizations that harvest, enhance, and distribute an ever-expanding range of data on government activities, and the creation of new opportunities to share, discuss, and analyze information vital to civic engagement.

For the goals of achieving government transparency and effective self-governance, what has been lost and what has been gained in all these transformations? Is the net effect positive or negative, and what lies ahead? In this talk we’ll lay out the different arguments in this debate, and we’ll engage the audience in the conversation.

I was really impressed by the quality of the audience questions at both conferences, and their engagement with Twitter. Our talk generated over 40 tweets at Transparency Camp. Here are samples from both talks:

@MobileTrevor Result of losing local news is fewer voters, lower civic participation, increased corruption, etc says @mtoppa #TCamp13

@zpez how can you maintain local engagement after an acute issue is resolved? build stronger networks; tap into the ppl w/ the data #TCamp13

@_anna_shaw The ‘digital political baseball cards’ from @ElectNext are pretty darn cool… Gonna be playing around with these later. #TCamp13

@ianfroude Local papers dying, so ‘ppl have gained access to the world (intl/natl papers) but lost access to their backyard’ #TCamp13

@jmikelyons: Politicians know everything about us, we know little about them. The Big Data Divide. Big civic problem #bcni13

@emmacarew #bcni13 impressive: folks at @electnext are working directly with the mayor’s office to makes data not just available but accessible

Transparency Camp was the larger of the two – over 600 people attended. Some traveled quite a distance to be there. In our talk we had questions from people involved in the media from as far away as Poland and Uganda.

Both conferences had a great sense of community. Many of the conversations I heard around me were similar to conversations we have at ElectNext, about how to bring greater transparency to government activities, and making open government data accessible and useful. I also had an unexpected but very welcome encounter: while passing through a crowd I heard a nearby voice say “hey Mike Toppa,” and turned to see a face I hadn’t seen in over 10 years. It was a former co-worker from my time at HighWire Press. He works at the Sunlight Foundation now. It was great to catch up and compare notes on our work. After the conference, I also got to catch up with my old friends Pat and Emma, from my days at Georgetown.

Here are the videos for both talks. If you only have time for one, I recommend the TransparencyCamp talk (the first one below). Below the videos are my summaries of the sessions I attended at Transparency Camp.

Transparency Camp Notes

These are my own brief summaries of the talks I attended. Most sessions had note takers, and their notes are at the TransparencyCamp site.

  • Electoral districts API talk: this was an overview of different initiatives out there, and pros and cons of different approaches. If you use maps to determine districts, you can do things like determine a district from a geo-location. But you can’t disambiguate things like apartment buildings that are split between districts, which is actually fairly common (often by odd/even apt numbers or by floor). This is called “packing” or “cracking”, depending on the goals of the gerrymandering (to either dilute or concentrate the voting power of a group of voters, and/or aid or hinder turnout efforts). District boundaries can also vary for state rep vs state senator, etc. At a technical level, using maps is easier. Addresses are harder because of the volume of data involved and you can’t rely on geo-location. Google is building up data based on addresses; most others are using maps.
  • A new project for city and state level engagement from opengovernment.org: they’re releasing a platform soon for facilitating citizen engagement with city councils, state reps, etc. It includes a petitioning system and lets elected officials register their own accounts, for direct online interaction with constituents. It also allows for entering info on legislation, etc, but isn’t a legislation management system.
  • “Municipal Open Gov efforts don’t scale down” – this was a discussion of the challenges of providing open gov in smaller cities, which don’t have the resources of big cities like Philly, Boston, etc. Short version: the only way to make this happen is to provide systems that help solve real city management problems (i.e. transparency for transparency’s sake isn’t going to happen if it means creating more work for already overworked staff) and give those systems an open api, so openness requires no additional effort.
  • Tracking shadow campaign money: this was led by Robert Maguire from OpenSecrets. It was fascinating but depressing: after the Citizens United decision, it’s become almost impossible to track hundreds of millions of dollars in campaign money. He described a complex set of schemes involving phony non-profits and other front organizations where money is moved around repeatedly so it’s hard to track. The FEC and IRS requirements are so minimal now, it’s hard to tell where the money is coming from or how it is spent. At OpenSecrets they are still able to give at least some top-level figures through IRS records, but often only a year after the fact. So they can get a rough sense of how much is being spent in total through this new shadow system, but they can’t get many specifics.
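Going back to the electoral districts talk: determining a district from a geo-location with maps comes down to a point-in-polygon test. Here is a minimal sketch using the standard ray-casting approach, with made-up rectangular districts and plain x/y coordinates. Real district boundaries and the APIs discussed in the talk are far more involved.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how many polygon edges a ray to the right crosses."""
    x, y = point
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_district(point, districts):
    """Return the name of the first district containing the point, or None."""
    for name, polygon in districts.items():
        if point_in_polygon(point, polygon):
            return name
    return None


# Hypothetical districts as lists of (x, y) corners.
DISTRICTS = {
    "District 1": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "District 2": [(4, 0), (8, 0), (8, 4), (4, 4)],
}

home = find_district((1, 1), DISTRICTS)
```

This also shows the limitation from the talk: a single coordinate can only land in one polygon, so it can't tell you that a building at that location is split between two districts.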

The 50 trillion dollar iPhone

Today, at the Agile Testing and BDD Exchange conference, Bob Martin mentioned an article in the EE Times about how microprocessors have changed the world. I looked it up, and the article uses a truly amazing example to make the point. Suppose it’s the late 1940s, and you want to build a device with the computing power of an iPhone. The most sophisticated computer at the time was ENIAC, which was powered by 17,468 vacuum tubes, had about 5 million hand-soldered joints, weighed 27 tons, and occupied 1800 square feet. A single iPhone contains about 100 billion transistors and weighs just under 4 ounces. Building the equivalent back then would have required:

  • Weight: 2,500 Nimitz-class aircraft carriers
  • Volume: 170 Vehicle Assembly Buildings (the VAB is at the Kennedy Space Center and is the largest single-story building in the world)
  • Power: over a terawatt, requiring all the output of 500 Olkiluoto power plants (the largest nuclear power plant in the world)
  • Cost: $50 trillion (the economic output of the entire world in 2011 was about $70 trillion)

And now you can put one in your pocket.

Bob went on to point out a fascinating contrast to that exponential advance in computing power: just how little computer programming has changed. Languages have come and gone, but programmers are still writing if statements and while loops. What we think of as modern advances, like object-oriented programming, were originally thought up in the 1960s.

Personally, I don’t see this as a problem. Programming languages are languages – they are forms of human expression. The world has changed in many dramatic ways since the time of Shakespeare, but we can read Shakespeare today and still relate to the motives, passions, and failings of the characters. Programming languages exist to communicate a painstaking set of instructions (and therefore aren’t as engaging to read as Shakespeare). But their domain is still that of human expression, for communicating often astonishingly subtle, complex, ever-changing, and sometimes seemingly contradictory needs. So, to me, it’s perfectly logical that, while syntax and techniques may be refined over time, the fundamental aspects of programming languages today would be much more familiar to a programmer from the 1950s than the incredibly small and powerful devices on which they now run.

WordCamp San Diego

[Photo: @theandystratton presenting "Accomplish it with Core: Galleries, Sliders and More" at WordCamp San Diego]
[Photo: @norcross is out of uniform for his presentation "Stay Classy, WordPress" at WordCamp San Diego]
[Photo: WordCamp San Diego Developer Day, at CoMerge]
[Photo: @tweetsfromchris takes on Nicky Rotten's 2.5 lbs. burger challenge (with a gigantic side of fries)]

This was my second WordCamp, and my first not as a speaker. When I presented at WordCamp Philly last fall, I was blown away by the positive energy of everyone there (which is one of the things that led to my current position with WebDevStudios). WordCamp San Diego was just as much fun, and there was plenty to learn too. Coming from Philly means it’s a long way to go for a WordCamp, but WebDevStudios was a sponsor, so several of us from the company went. Since we are a virtual company, I also met a couple of my co-workers in person for the first time – @tweetsfromchris and @TobyBenjamin.

WordCamps typically have 2 simultaneous tracks – one for developers and one for users. They also provide an opportunity for these two parts of the WordPress community to come together, so online businesses can find good developers, and developers can find rewarding projects.

I stayed in the developer track for all but one presentation, and they were all excellent. WebDevStudios’ own @williamsba presented on how to configure and use WordPress multi-site. Even in the more introductory-level sessions, where I thought I’d already know everything, I actually learned a lot. The vibrancy of the WordPress community, and the dedication of the speakers, who appear without compensation, continue to impress me.

The “spring training” theme was really well done, from the matching baseball jerseys for the speakers, to the web site, stickers, and, of course, the cake. @norcross gave his whole talk as Ron Burgundy (yes, in his boxers), which was hilarious enough to justify him being the only speaker out of uniform.

The after party was a blast. It was my first experience where it was socially acceptable to both drink and have endless conversations about code and WordPress. I have found my people 🙂 and it was great to meet @housechick, @jaredatch, @matthewjcnpilon and @i3inary.

The 2nd day of the conference was a developers’ day, held at the very sleek Co-Merge workplace. This was similar to the developers’ day at WordCamp Philly, with some short presentations, but the focus was more on people making connections and helping each other code.

The one challenge for me was sleep. WebDevStudios rented an apartment since several of us were there. The first night there was a party happening in an adjacent unit, and the thumping bass didn’t stop coming through the floor until about 3AM. The next night someone was shot and killed right outside our apartment, and the last night one of my co-workers had to get up and leave really early for his flight. But I’m not so old (yet) that I can’t handle it (actually, having kids has conditioned me to handle sleep deprivation better than I did years ago).

My next WordCamp is in just a few weeks. I’ll be speaking at WordCamp Nashville, on how to apply dependency injection techniques to WordPress plugin development.

I took pictures throughout the day – here’s the complete album:

2012 - WordCamp San Diego (Mar 23, 2012, 14 photos)

The Bane of WordPress Plugin Development: register_activation_hook

If you’re developing a WordPress plugin, any initial, one-time setup work your plugin needs (such as registering settings or creating a database table) is done through a call to register_activation_hook(). Debugging problems with code you call through this hook is notoriously difficult. I think I’ve pulled my hair out over each one of the many things that can go wrong, so I thought I’d share my hard-earned solutions (these all apply to the current version of WordPress, 3.0.4):

  • You get a “Cannot redeclare” error when activating your plugin from the plugin menu: this may tempt you to put an if function_exists() wrapper around your function (or if class_exists() around your class). But don’t waste your time – all it really means is that there was an error of some kind. Any kind of logic error within your function (such as an incorrect number of arguments to a function call) will result in this error being displayed. Don’t be fooled.
  • You try using echo or print to debug, but never see any output: normal output isn’t shown during plugin activation. If you want to see some debugging output, call trigger_error() instead. This will force your message to be displayed in the plugin activation status box.
  • It says “plugin activated” after you activate from the plugin menu, but nothing actually happened: this typically means there’s something wrong with your arguments to register_activation_hook, and it just fails silently. The first argument must be the path to your main plugin file (i.e. the file with the plugin comments at the top – __FILE__ will do fine), which is where your activation function must be. The existence of this argument is deceptive, as it suggests you could put the function in another file (you could, I suppose, put your call to register_activation_hook in another file, and then put your callback function in the main file, but I can’t think of a good reason to do that). The second argument is the function name – if you want to get fancy, it’s fine to use callback pseudo-types.
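  • You’re having trouble using a global variable: this limitation is actually documented. During activation your plugin file is included from inside a function rather than at global scope, so any global variable your activation function relies on must be explicitly declared with the global keyword.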
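
To tie those points together, here is a minimal sketch of a main plugin file. The plugin name, option name, and myplugin_ prefix are all made up for illustration, and the function_exists() guard is only there so the file also loads outside WordPress. The trigger_error() call uses PHP's default error level; whether the message actually shows up depends on your error-reporting settings.

```php
<?php
// Hypothetical main plugin file, e.g. my-plugin/my-plugin.php

// Declare the global explicitly: during activation this file is included
// from inside a function, so a bare assignment would not be global.
global $myplugin_db_version;
$myplugin_db_version = '1.0';

function myplugin_activate() {
    global $myplugin_db_version;

    // echo/print output is not shown during activation, so use
    // trigger_error() while debugging (remove it when you are done).
    trigger_error('myplugin_activate() reached, version ' . $myplugin_db_version);

    // One-time setup work goes here; storing an option is just an example.
    add_option('myplugin_db_version', $myplugin_db_version);
}

// First argument: path to the main plugin file (this file).
// Second argument: the name of the activation callback defined above.
if (function_exists('register_activation_hook')) {
    register_activation_hook(__FILE__, 'myplugin_activate');
}
```

Keeping the callback and the register_activation_hook() call in the same main file avoids the silent failure described above.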

Two general guidelines I recommend are:

  1. Temporarily put an “activate” button on your plugin’s settings page, and have it call your activation method. This will allow you to separate any problems with your activation function from any problems that may exist in how you’re calling it through register_activation_hook. This is a good way to expose errors that are otherwise hidden behind the “cannot redeclare” error.
  2. Write unit tests and do test-driven development! This will give you a way to verify the functionality of your code as you work on it, and will let you know immediately if you break anything. I’ll have an upcoming post on how to use SimpleTest with WordPress.
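
The first guideline could look roughly like the sketch below. It assumes an activation function called myplugin_activate() already exists in your plugin, and the page title, slug, and field names are placeholders. Take the button out again before you release anything.

```php
<?php
// Hypothetical names throughout: myplugin_activate(), myplugin_settings_page().

function myplugin_settings_page() {
    // Run the activation routine directly when the temporary button is pressed
    if (isset($_POST['myplugin_run_activation'])) {
        check_admin_referer('myplugin_run_activation');
        myplugin_activate(); // errors now show up as ordinary PHP errors
        echo '<div class="updated"><p>Activation routine ran.</p></div>';
    }
    ?>
    <div class="wrap">
        <h2>My Plugin</h2>
        <form method="post">
            <?php wp_nonce_field('myplugin_run_activation'); ?>
            <input type="submit" name="myplugin_run_activation"
                   class="button" value="Run activation (debug only)" />
        </form>
    </div>
    <?php
}

function myplugin_add_settings_page() {
    // Adds the page under Settings, visible to administrators only
    add_options_page('My Plugin', 'My Plugin', 'manage_options',
        'myplugin', 'myplugin_settings_page');
}

// Guarded only so the sketch also loads outside WordPress
if (function_exists('add_action')) {
    add_action('admin_menu', 'myplugin_add_settings_page');
}
```

Because the activation code now runs during a normal admin page request, echo and print work as usual and real error messages are no longer masked.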

Displaying WordPress Posts by Category, Even If They’re Not Recent

A couple of years ago I wrote a post describing how to limit posts on your WordPress home page to ones in specific categories. A limitation of this approach is that it can only show recent posts that are in The Loop. Others have solved this problem as well, but all the solutions I’ve seen have this same limitation. What if posts in the categories you specify aren’t among your recent posts? Your home page would show no posts!

This was a problem for my new home page design. I’ve divided my site’s content into 3 major topics, and I show hyperlinked titles for the 3 most recent posts in each topic. But it’s possible that a topic may not have any posts among the most recent 10 posts, which is all The Loop knows about (10 is WordPress’ default setting for how many posts to show on your home page). So I want to get the most recent 3 posts for each topic, regardless of whether they happen to be in The Loop.

To do this, I created the following function and put it in my theme’s functions.php file. Note that I’m using a straight SQL query, which means this is not guaranteed to work in future versions of WordPress (the WP coders do a good job of maintaining a consistent programming API across versions, but they do change the database sometimes).

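function get_top_category_posts ($term_ids) {
    global $wpdb;
    $top_3 = '';

    // Force every ID to an integer so the list is safe to embed in the SQL
    $term_ids = implode(',', array_map('intval', explode(',', $term_ids)));

    // Get the 3 most recent published posts in any of the given terms.
    // "distinct" keeps a post that is in several of the terms from being
    // listed more than once.
    $results = $wpdb->get_results("select distinct ID, post_title, post_date from wp_posts p
            inner join wp_term_relationships r on p.ID = r.object_id
            inner join wp_term_taxonomy t on t.term_taxonomy_id = r.term_taxonomy_id
            where t.term_id in ($term_ids) and p.post_type = 'post'
            and p.post_status = 'publish'
            order by post_date desc limit 3", ARRAY_A);

    // Build one list item per post: "Mon dd - linked title"
    foreach ($results as $result) {
        $top_3 .= "<li>" . date("M d", strtotime($result['post_date'])) . " - "
            . '<a href="' . get_permalink($result['ID']) . '">'
            . $result['post_title'] . "</a></li>\n";
    }

    return $top_3;
}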

I then call it like this from my custom home page (where I’m not using The Loop at all):

<ul>
<?php echo get_top_category_posts('6,12,89,90,98,105,106,115'); ?>
</ul>

I’m passing the IDs for the categories I want. The easiest way to get a category ID is to go to its edit screen and note the cat_ID in the URL. This works for tag IDs also.
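
For anyone worried about a straight SQL query breaking in a future version, here is a rough sketch of the same idea using get_posts() instead. The function names and the myhome_ prefix are made up for illustration. It relies on the category__in argument, so unlike the SQL version it only matches category IDs, not tag IDs.

```php
<?php
// Hypothetical helper: turn '6,12,89' into array(6, 12, 89), dropping junk.
function myhome_parse_ids($csv) {
    $ids = array_map('intval', explode(',', $csv));
    return array_values(array_filter($ids)); // array_filter() drops the zeros
}

// Sketch of the "3 most recent posts" list built with get_posts()
function myhome_top_category_posts($category_ids_csv) {
    $category_ids = myhome_parse_ids($category_ids_csv);
    if (empty($category_ids)) {
        return ''; // an empty list would otherwise match every category
    }

    // get_posts() defaults to published posts, newest first
    $posts = get_posts(array(
        'numberposts'  => 3,
        'category__in' => $category_ids,
    ));

    $top_3 = '';
    foreach ($posts as $post) {
        $top_3 .= '<li>' . date('M d', strtotime($post->post_date)) . ' - '
            . '<a href="' . get_permalink($post->ID) . '">'
            . $post->post_title . "</a></li>\n";
    }

    return $top_3;
}
```

It is called the same way, with a comma-separated list of category IDs. The trade-off is that WordPress does more work per call than the hand-written query does.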

There are 3 database tables involved in the query: wp_posts contains your posts, wp_term_taxonomy contains the term IDs for categories (and tags), and wp_term_relationships connects the posts to their categories. Note that a cat_ID or tag_ID you see in an edit screen URL is actually called a term_id within the database. In this case I’m only retrieving the date and title of each post, but you could retrieve the entire post if you want (see the description of the wp_posts table).

If you want to simplify the query to improve performance, you can eliminate one of the table joins if you manually look up the term_taxonomy_id that corresponds to each category’s term_id, and pass those instead:

select distinct ID, post_title, post_date from wp_posts
        inner join wp_term_relationships on ID = object_id
        where term_taxonomy_id in ($taxonomy_ids) and post_type = 'post'
        and post_status = 'publish'
        order by post_date desc limit 3

The downside of doing this is that every time you add a new category or tag, you’ll have to go into your database and look up its term_taxonomy_id. So I don’t recommend doing it this way unless you’re comfortable poking around in MySQL.
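
If you would rather not do that lookup by hand, a one-off helper along these lines could do it for you. This is only a sketch: the function name is made up, and it assumes the default wp_ table prefix and that you only care about categories and tags. Run it once and paste the result into your template, since calling it on every page load would put back the query you were trying to avoid.

```php
<?php
// Hypothetical helper: map term IDs (cat_ID / tag_ID values) to the
// matching term_taxonomy_id values, returned as a comma-separated string.
function myhome_lookup_term_taxonomy_ids($term_ids) {
    global $wpdb;

    // Force every ID to an integer so the list is safe to embed in the SQL
    $term_ids = implode(',', array_map('intval', explode(',', $term_ids)));

    // One term_id can have a row per taxonomy, so limit the lookup to
    // categories and tags
    $ids = $wpdb->get_col("select term_taxonomy_id from wp_term_taxonomy
            where term_id in ($term_ids)
            and taxonomy in ('category', 'post_tag')");

    return implode(',', $ids);
}
```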

(Note to fellow cranky programmers: I know this is poorly abstracted – the need here is simple enough that, to me, abstraction didn’t seem worth the trouble.)
