Amazon Archival

Well-capitalized Seattle start-up seeks Unix developers

Well-capitalized start-up seeks extremely talented C/C++/Unix developers to help pioneer commerce on the Internet.  You must have experience designing and building large and complex (yet maintainable) systems, and you should be able to do so in about one-third the time that most competent people think possible.  You should have a BS, MS or PhD in Computer Science or the equivalent.  Top-notch communication skills are essential. Familiarity with web servers and HTML would be helpful but is not necessary.

Expect talented, motivated, intense, and interesting co-workers.  Must be willing to relocate to the Seattle area (we will help cover moving costs).

Your compensation will include meaningful equity ownership.

Send resume and cover letter to Jeff Bezos:

mail:            be…
fax:             206/828-0951
US mail:         Cadabra, Inc.
                 10704 N.E. 28th St.
                 Bellevue, WA  98004

We are an equal opportunity employer.

“It’s easier to invent the future than to predict it.”  — Alan Kay


Cite Something

See something? Cite something.

… [we] had plenty of experience with our work making its way across the internet, both with and without attribution to our hard work, and so we thought we would make a comic/chart about how to do it right. Consider this reference material…


If you see something you want to use on-line, have a look at Not Quite Wrong’s Attribution Flowchart, or a similar guide. Please click on image to view full-size.

Don’t plagiarize!

That may be easier said than done. Creative Commons (“CC”) licensing simplifies matters significantly. Yet CC licenses are optional and, while frequently specified, certainly not universal.

Prior to the age of digital publishing, this was not such a widespread concern. Print media channels had in-house counsel and other persons trained in the rules of fair use and content re-purposing. 

Attribution is preferable to Retribution

I felt like I bore the Mark of Cain after an incident due to one of my very first blog posts.

I used two photos of snails for a brief post on the beauty of spirals in nature. I noticed the snails on the Flickr photo site. My account on Flickr was nearly as new as my WordPress blog site. I did read Flickr’s Terms of Service for photo reuse. This was possibly the first TOS for a website I’d ever read, other than the TOS for Second Life! 

A few days later, I found myself the subject of scrutiny, communicated quite eloquently by the abrupt increase in page views indicated by WordPress’s statistics plug-in!

In those days, I wasn’t aware of the more comprehensive tracking available via Google Analytics. Thankfully.

The snail photos belonged to a skilled amateur photographer. She was a mature woman fluent only in languages with which I wasn’t familiar.

My initial blog post did not give appropriate attribution to her.

I amended it, but did not do so correctly under the terms of the CC license. No good.

I tried again. The photographer was not quite satisfied, which she expressed as best she could.

I felt so guilty, misusing this woman’s work! Finally, I got it right after a third revision.

Fortunately this was prior to the additional complications of Getty Image licensing, which is a default or opt-out for the entire Flickr site.

Image sharing is particularly confusing

Here are other questions about general practices. I was unsure if a watermark on a photo meant that:

  • I was not allowed to re-use it under any circumstances, or
  • The watermark alone was sufficient attribution and no other details should be given.

Although I now know that the second condition is inadequate, I’m still uneasy and uncertain about re-use of watermarked images.

The two leading image sharing sites, Flickr (owned by Yahoo!) and Picasa (a Google product), offer an All Rights Reserved designation, as well as all varieties of CC licenses. Yet I sometimes observe All Rights Reserved images reproduced.

I’ve read the Flickr help pages and remain confused regarding fair use. Does All Rights Reserved override the basic Terms of Service of the Flickr site, which state that all publicly viewable images on Flickr can be reproduced as long as the image links back to the site with appropriate attribution? And what of the new Getty Image licensing?

Fail-safe way to cite right

When in doubt, send a quick email or IM to the artist or photographer. Most reply very quickly. I’ve asked artists for permission to reproduce their work despite the All Rights Reserved or copyright sign. Most have given permission immediately, or within a few hours of visiting my sites and confirming that they were non-commercial.

My single experience with a private foundation’s photos was only slightly more involved. A curator wanted to preview the content and placement of the image prior to posting, which was a reasonable request. In every case, I was allowed to post images free-of-charge.

Yes, attribution and/or obtaining permission is the right thing to do. It’s also personally worthwhile, for the sake of your peace of mind!


eDiscovery and demise of News of the World

A new use case for text analysis is emerging in the legal field. It is referred to as eDiscovery. Such methods are not widely accepted, let alone implemented as yet, but they are receiving increasing amounts of attention.

What is eDiscovery?

eDiscovery is a platform, combining algorithms, software and productivity tools. It is most obviously useful for expediting in-house legal document retrieval. I learned of the existence of eDiscovery quite recently. Inside Counsel gives this definition as part of an 8 June 2012 post on the limitations of eDiscovery:

eDiscovery offers search methodologies to rein in time spent on electronic document review. One strategy is “computer assisted review,” also known as “predictive coding” or “predictive analytics.” Predictive analytics is the nonspecific term for a computer program that uses algorithms to sample and predict relevancy across large collections of electronically stored information.

Both terms, “predictive analytics” and “predictive coding”, were confusing to me. The terms are similar to ones used in quantitative analysis. They may almost be considered as applications of the same methodologies, but in a legal context. There is a greater emphasis on text though. There are other details which I haven’t read enough about, thus cannot hazard a better guess as yet.
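To make the quoted definition a little more concrete for myself, here is a minimal sketch of what “sample and predict relevancy” could look like. It is only my own illustration, not how any eDiscovery product works: I am assuming a plain naive Bayes word model, and the sample documents, labels and function names (`train`, `relevance_score`) are all made up. The idea is that a reviewer labels a small sample by hand, and the program then ranks the unreviewed documents by how relevant they appear.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase, split on whitespace, keep purely alphabetic words only.
    return [w for w in text.lower().split() if w.isalpha()]


def train(labeled_docs):
    """Build word counts from (text, is_relevant) pairs reviewed by a human.

    Assumes the sample contains at least one relevant and one
    irrelevant document.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in labeled_docs:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(text, model):
    """Log-odds that a document is relevant (naive Bayes, add-one smoothing)."""
    word_counts, doc_counts, vocab = model
    score = math.log(doc_counts[True] / doc_counts[False])
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in the sample carry no evidence
        p_rel = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_irr = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_rel / p_irr)
    return score


# Hypothetical hand-reviewed sample (invented for this example).
reviewed_sample = [
    ("invoice payment contract breach", True),
    ("contract termination notice dispute", True),
    ("lunch menu friday pizza", False),
    ("office party holiday schedule", False),
]
model = train(reviewed_sample)

# Rank the unreviewed documents, most likely relevant first.
unreviewed = ["breach of contract dispute", "pizza party friday"]
ranked = sorted(unreviewed, key=lambda d: relevance_score(d, model), reverse=True)
```

A positive score means the words in a document look more like the relevant sample than the irrelevant one. Real review sets would be vastly larger, and I would expect commercial tools to use more sophisticated models than this.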

Further refinement needed

According to a 2010 Duke University survey of major companies (via the same Inside Counsel article), emphasis all mine:

The expense of electronic discovery is the most rapidly increasing item in the average litigation budget… This growth in e-discovery expenses is even more alarming [because] there is no evidence that it has resulted in a corresponding increase in the volume of relevant or important material being produced in litigation.

An Analysis of Hackgate

eDiscovery can be exciting, especially when it is about the recent demise of ‘News of the World’ (‘News of the World’ is the much publicized and scandal-ridden Rupert Murdoch flagship publication). Here’s the premise:

What if the analysis were to have been approached with an eDiscovery-enabled perspective?


Especially useful curation

A list of uncommonly useful links and news items by an uncommonly astute person, Greg Linden (formerly of Amazon search in the early days) follows below. This is the best of all worlds: Having access to someone who has superior insights due to field of expertise, is reasonably articulate, and is willing to share without ulterior motive or bias.


Researcher Mapping is a search engine for peer-reviewed research publications, with an overlay of geospatial data. Repositories include Springer-Verlag (Springer Science+Business Media may be the latest official name), PubMed and BioMed Central. Springer is the developer of Author Mapper.

Perhaps the most impressive feature to me was the historical time span covered. For example, I found three of my father’s research publications, one of which was written the year of his graduation from medical school, a very long time ago.


Chart art

Edward Tufte’s first text, The Visual Display of Quantitative Information, introduced standards for graphical representation. It is considered the definitive guide for visual display of complex data.

UPDATE 4 September 2014


What once was old

Ever heard of Telex?

I have. It’s old. Or was. Not anymore. Telex is the term being used to describe an experimental system for proxy-less access to the internet. It is based on that mouthful of a word, “public key steganography”.

I first saw the topic mentioned while reading an InfoSecIsland post earlier today. This is a comment from the University of Michigan researcher who developed Telex:


Idea for a very open ID

Be receptive! Be open to each and every type of user input for authentication.

This very user-centric approach for identity resolution leverages the many open APIs now available for web services. Feel free to select your user name-of-choice!

  • @Twitter user name
  • user name
  • user email address
  • user or user blog URL
  • Open ID provider URL
  • more?

In his identity resolution related post, developer Luis Farzati emphasizes that:

the objective is to allow the user to input whatever wanted [in order] to login… If it exists as a valid username out here, we’ll find it and suggest it!
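As a rough sketch of the first step such a login box would need, the snippet below guesses what kind of identifier the user typed. This is my own illustration and not Luis Farzati’s code: the function name `classify_identifier`, the category labels and the regular expressions are all assumptions, and the patterns are deliberately loose. A real system would go on to query each service’s API to confirm that the name actually exists.

```python
import re


def classify_identifier(raw):
    """Guess the kind of identifier a user typed into a login box."""
    value = raw.strip()
    if not value:
        return "empty"
    # Assumed Twitter-style handle: "@" plus up to 15 letters, digits or underscores.
    if re.fullmatch(r"@[A-Za-z0-9_]{1,15}", value):
        return "twitter_handle"
    # Very loose email check: something@something.something, no spaces.
    if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", value):
        return "email"
    # Explicit http(s) URL, e.g. a blog or an Open ID provider URL.
    if re.match(r"https?://", value, re.IGNORECASE):
        return "url"
    # Bare domain with optional path, e.g. "example.com/blog".
    if re.fullmatch(r"([a-z0-9-]+\.)+[a-z]{2,}(/\S*)?", value, re.IGNORECASE):
        return "url"
    # Anything else is treated as a plain user name to look up across services.
    return "username"


for sample in ["@someone", "user@example.com", "example.com/blog", "plainname"]:
    print(sample, "->", classify_identifier(sample))
```

The order of the checks matters: the email test runs before the bare-domain test, so an address is never mistaken for a URL.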


Fear and loathing of HTML5

Part 1: Conflict

I wrote a post about development of the new HTML standard yesterday. Quite a conflict, a public one, is in progress between Internet standards groups, WHATWG and W3C, and to some extent within WHATWG itself. 

Ian Hickson (a.k.a. Hixie), project leader with WHATWG, is catching much of the blame for being autocratic. But some of that is unfair, as he is also advocating the virtues of expediency and compromise in response to market demand and user needs. See this Google+ discussion on HTML standards definitions, led by Ian.

Also see a remarkably honest and humorous post from Ian’s personal web page, Hixie on Handling People, which is being circulated, with defamatory effect. I think that is unfair and unwise. If Ian’s leadership is a problem, this is not the way to remedy it. The entry from his personal webpage is harmless. It shows he has realistic insights into how individuals and groups interact.

Regarding HTML, Hixie should be aware that bowing to user demand is not always the best choice when developing standards. That’s why there ARE standards! Short term pain must often be endured for longer term benefits. An analogy is financial regulation. No one likes the SEC and internal audit departments when they impose restrictions that seem disruptive to market participants with a narrowly focused point of view. In the long-term, it is usually beneficial to everyone.

There is a difference between securities regulators, who have power of legal enforcement, and WHATWG, though! Yes, W3C and WHATWG DO have more power than I did as a Data Governance manager at TriCare. But they don’t have the ability to impose sanctions, like the U.S. Securities and Exchange Commission does. If WHATWG or W3C standards are perceived as impractical or too costly, industry WILL circumvent, i.e. ignore, the standards. But I don’t think Ian should say, in effect, that technical implementation defines the standards. That will lead to nothing but grief and short term thinking, and hurt the industry ultimately.

Accessibility standards are another concern, alluded to with extreme delicacy (so subtle I’m not even certain) in the Google+ post. They are personally important to me, as a user. But again, it is a matter of benefiting the few at a cost to the many, in terms of delayed development time. Better accessibility is necessary, and has been needed for a long time. But forty years of past neglect can’t be remedied at once.

Part 2: Implementation Success Story!

Kaazing is associated with the HTML5 Doctor mentioned in my post, see above. The following is a delightful presentation by Kaazing from a conference in October 2011, using Prezi instead of Slideshare or PowerPoint. Reproduction is allowed here under CC License 3.0.


Why is Amazon Legal Dept accessing my Gmail again?

Activity on this account
This feature provides information about the last activity on this mail account and any concurrent activity.
This account does not seem to be open in any other location. However, there may be sessions that have not been signed out.