Google graphs and graphs for code search

A graphing calculator

In 2012, Google discontinued its highly regarded Code Search, a free service for searching open source code. This is the only description of the pre-2012 Google Code Search I could find:

…it was designed to help people search for open source code all over the web [and] will be shut down along with the Code Search API on January 15, 2012.

Around the same time, Google added graphing calculator functionality to regular search.

Now, when you type a function, you’ll see it graphed… You can graph more than one function at a time by separating them with commas. Once the graph is drawn, you can zoom and pan to see the sections and details you want. And the Google colors are a nice touch.

One doesn’t quite replace the other.

Surprise! It’s BACK

Google Code Search has returned! Google Developers provides a reference document and a presumably comprehensive User’s Guide. All the User’s Guide pages are dated March 12, 2020.

Prior to 2020, Google Code Search was an internal tool, accessible only to Google’s own developers. On April 1, 2020, Google announced that it would make a version (“same binary, different flags”) available to open source communities as well. There were no April Fools’ Day activities this year, due to the pandemic. I might have been a bit suspicious otherwise, as Google has a long history of fun on that day.

The new tool uses a custom search language and supports regular expressions. It does text search, although I do not know if it is language-agnostic.

Graphs for search

Some of the repositories for which the new code search is supported have cross-references. Cross-references use another Google tool called Kythe.

Kythe sounds complicated! It takes the compiled form of a code repository and turns it into graph data: nodes and edges that conform to Kythe’s graph schema. There’s a lot more, and the documentation presumably explains it.
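To get a feel for why a graph is useful for cross-references, here is a toy sketch in Python. It is not Kythe and does not use Kythe's actual schema: the node names and the edge labels "defines" and "ref" are made up for illustration. The only idea it borrows is that locations in source files are linked to the symbols they define or refer to.

```python
from collections import defaultdict

# Each fact is (source node, edge label, target node).
# All names and labels here are hypothetical, not real Kythe identifiers.
edges = [
    ("main.go:3", "defines", "func:main"),
    ("main.go:4", "ref", "func:helper"),
    ("util.go:1", "defines", "func:helper"),
    ("util.go:9", "ref", "func:helper"),
]


def build_reverse_index(edges):
    """Group source locations by the symbol they point at, then by edge label."""
    index = defaultdict(lambda: defaultdict(list))
    for source, label, target in edges:
        index[target][label].append(source)
    return index


def cross_references(index, symbol):
    """Return where a symbol is defined and where it is referenced."""
    entry = index.get(symbol, {})
    return {
        "definitions": sorted(entry.get("defines", [])),
        "references": sorted(entry.get("ref", [])),
    }


index = build_reverse_index(edges)
xrefs = cross_references(index, "func:helper")
print(xrefs)
```

Once the facts are stored as edges, "jump to definition" and "find all references" become simple lookups, which is presumably why only the repositories with this graph data get cross-references.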

InfoQ reported the Code Search announcement, noting these limitations:

Code Search does not give access to the real repositories used at Google. It just exposes indexed versions of those repositories to make their content available through search. Additionally, the public Code Search interface does not include all features provided to Google engineers, including automatic code analysis and linting, code coverage, fuzzing integration, and so on.

Not language agnostic

After reading a little more, I confirmed that these are the only repositories for which one may use Google Code Search.

  • Angular
  • Bazel (with cross-references)
  • Dart
  • ExoPlayer
  • Firebase SDK
  • Flutter
  • Go (with cross-references)
  • gVisor (with cross-references)
  • Kythe (with cross-references)
  • Nomulus (with cross-references)
  • Outline
  • TensorFlow (with cross-references)

How is version 1.0 different from version 2.0?

I don’t know.

Thanks to this old Stack Overflow question, I found a chronology of the first Google Code Search, which Russ Cox developed in 2006 as a Google summer intern. He is rsc on GitHub. Buried in the comments on that question was the URL for rsc’s write-up of his work, titled Regular Expression Matching with a Trigram Index or How Google Code Search Worked. Its essence is fast regular expression search: a trigram index narrows down the candidate files, and the full regular expression runs only on those. It is dated 2012, so maybe rsc wasn’t allowed to publicly post development documentation for Google Code Search 1.0 until it was retired.
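The trigram idea is easier to see in code. Below is a minimal Python sketch of it, not rsc's implementation: the function names and sample files are made up, and I assume the caller supplies a literal substring that every match must contain, where the real tool works out the required trigrams from the regular expression itself.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(docs):
    """Map each trigram to the set of document names that contain it."""
    index = defaultdict(set)
    for name, body in docs.items():
        for gram in trigrams(body):
            index[gram].add(name)
    return index


def search(index, docs, literal, pattern):
    """Shortlist documents via the index, then confirm with the real regex.

    `literal` is a substring every match of `pattern` must contain
    (an assumption of this sketch; it is supplied by hand).
    """
    candidates = set(docs)
    for gram in trigrams(literal):
        candidates &= index.get(gram, set())
    regex = re.compile(pattern)
    return sorted(name for name in candidates if regex.search(docs[name]))


# Hypothetical sample files, for illustration only.
docs = {
    "a.go": 'func main() { fmt.Println("hello") }',
    "b.py": "def main():\n    print('hello')",
    "c.txt": "nothing to see here",
}
index = build_index(docs)
hits = search(index, docs, "Println", r"fmt\.Println\(")
print(hits)
```

The index never decides a match on its own. It only rules out files that cannot match, so the expensive regular expression runs on a short list instead of on everything.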

I suspect that rsc is involved in Google Code Search 2.0, as I see his name all over the GitHub repository containing the code for the custom search language.

I wonder if the scope of use for the new Google Code Search is narrower than for Google Code Search 1.0, even if it is more sophisticated (for example, in its use of Kythe). rsc would know the answer to that.
