Categories
Uncategorized

War on Content Farms Now in Progress

Google Declares War on Content Farms:

Google has announced a major algorithmic change to its search engine. The visible impact on users should be subtle, yet the change is meant to dramatically improve the quality of Google’s search results…

Google is targeting content farms.

This update is designed to reduce rankings for low-quality sites — sites which copy content from other websites or sites that are just not very useful…. It will provide better rankings for sites with original content, such as research, in-depth reports, thoughtful analysis and so on.

The change should make it easier to find high quality sites.

Google did not give details of the change, which should impact 11.8% of Google’s queries (currently only in the U.S., with plans to roll it out elsewhere over time), but it does say that it will affect the ranking of many sites on the web.

The list of related articles I have hand selected (just like I dredge through string beans in order to find the very best ones) may be of further interest to those with a sense of humor. Or without a personal stake in content farming.

Related Articles
Categories
Googleplex

How To Use Google To Search

Seems like it would be obvious, doesn’t it? Usually it is, but choosing the right type of search and using query syntax take a bit more knowledge. When you need them, they are worth the extra effort.

So many choices! Which Google search to use?

Start with regular Google search. By “regular”, I am referring to the bread-and-butter of search engines, Google universal search. You can find search syntax here, How to use Google universal search. That Search Engine Land post describes specific Google search options, by file type and subject matter, such as Image, Video, News, Shopping, and Travel, see below.

Google universal search screen shot example

Google Universal Search

This detailed guide maintained by Google describes the meaning of each item returned on the search results pages. It is a great resource!

Try Google Blog Search and Books too. [Update: Blog Search has been discontinued.]

Google search

Special searches and more

For more subtle or granular inquiries, Google provides special searches. I wrote a detailed post on Google special search syntax a few weeks ago.

Google maintains a search resource page of documents for localization help, page removal from the index, and all sorts of specific user support FAQs.

Categories
Uncategorized

Mailhide

If you’ve ever looked at an open-source development project hosted on Google servers, usually on http://code.google.com sites, Mailhide will be familiar. It is a less well-known application of the reCAPTCHA detection challenge.

Mailhide conceals part of an email address

This is how it prevents spammers from harvesting email addresses with automated programs. Typically, the first few letters or numbers of the username part of the email address are visible, followed by an ellipsis (three dots) and then the domain name.
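The masking described above can be sketched in a few lines of Python. This is only an illustration of the visible result, not Mailhide’s actual implementation: the function name `mask_email` and the default of three visible characters are my own assumptions, and the real service also puts the full address behind a reCAPTCHA challenge.

```python
def mask_email(address: str, visible: int = 3) -> str:
    """Hide most of the username part of an email address behind an ellipsis.

    Hypothetical helper for illustration; not part of any Mailhide library.
    """
    username, sep, domain = address.partition("@")
    if not sep or not username or not domain:
        raise ValueError(f"not an email address: {address!r}")
    # Show at most `visible` leading characters, but always hide at least one.
    shown = username[: min(visible, len(username) - 1)]
    return f"{shown}...@{domain}"


print(mask_email("jane.doe@example.com"))  # jan...@example.com
```

Short usernames are masked more aggressively, so a two-letter username shows only its first letter.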

Most Google employees* use Mailhide. Mailhide is offered as an option to developers using Google Code sites.

Mailhide-type functionality is also offered by Slashdot for user accounts. Slashdot is not necessarily using Google reCAPTCHA to do it, however. There are other Turing tests besides reCAPTCHA.

reCAPTCHA is a Google product. It was not developed by Google, though. reCAPTCHA originated at Carnegie Mellon University, and Google acquired it in 2009.

reCAPTCHA Mailhide API

Are you running a web application that lists users’ email addresses? Do your users a favor by shielding them from spam with reCAPTCHA Mailhide.

Google will give you an API (cryptographic) key. Use it to encrypt user email addresses. Google supplies full documentation for the Mailhide protocol. Everything is free of charge.

I am uncertain whether API usage restrictions apply. Such limits are familiar to application developers who rely on the Twitter API. They should not be a binding constraint in this case, as Mailhide is far less transactional than Twitter. Unless one is very, very popular!

reCAPTCHA comes in many flavors!

Libraries are available for PHP, Perl, Ruby and Python programs.

*Google employee accounts in the U.S.A., and many but not all other countries, have the format  [email protected].  Non-employee Google mail accounts are  [email protected].

Categories
Googleplex

Viral Search and Analysis on the Social Web

Google is making inroads into the field of social search. However, there are already well-established providers that specialize in that field. One such search engine is PeopleBrowsr.

Similar to how Google has indexed the web, PeopleBrowsr has indexed Twitter:

With Twitter’s Firehose and our proprietary server technology, we have reliable access to over 3 years of data.

PeopleBrowsr recently introduced a social search engine that has the potential to carve its own niche in the space where Google’s search algorithms and simple Twitter activity trackers intersect.

ReSearch.ly

The new search engine is branded ReSearch.ly*. PeopleBrowsr has designed ReSearch.ly for “online discovery analysis and interaction”.

ReSearch.ly is for consumers, brand marketers and researchers. Its goal is to

build advanced conversation technologies to assemble the collective intelligence through storing, retrieving and indexing every public human conversation. Now at this pivotal era of digital preservation in social media, we’re releasing 1,000 days of Twitter data – free of charge – for deep historical reporting and social search.

ReSearch.ly differentiates itself by offering these four tracking and analysis functions:

  1. The Interest Graph – access by topic and keyword
  2. Degrees of Separation – a relationship mapping tool to discover the relationship between any two Twitter users
  3. Community Search – drill-down searching for user subsets with one or more common attributes
  4. Location-based Search – drill-down search within a geographically targeted user group
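To give a feel for what a “degrees of separation” tool computes, here is a minimal sketch using breadth-first search over a small, made-up relationship graph. I do not know how ReSearch.ly actually implements this feature; the function name, the graph layout and the user names below are all hypothetical.

```python
from collections import deque


def degrees_of_separation(graph, start, target):
    """Return the number of hops between two users, or None if unconnected.

    `graph` maps each user to the users they are connected to.
    Breadth-first search guarantees the first path found is the shortest.
    """
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        user, distance = queue.popleft()
        for neighbor in graph.get(user, ()):
            if neighbor == target:
                return distance + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


# Made-up example data: who is connected to whom.
example_graph = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol"],
    "erin": [],
}

print(degrees_of_separation(example_graph, "alice", "dave"))  # 3
print(degrees_of_separation(example_graph, "alice", "erin"))  # None
```

A real service would run this kind of search over an index of millions of accounts, which is where the proprietary server technology presumably comes in.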

The new service’s corporate motto is “Instant Communities In Real-Time with Viral Analytics and Viral Search”.  As of now, it seems to focus exclusively on Twitter stream content.

*Yes, that is correct. ReSearch.ly operates under the auspices of the Libyan Government, as .ly is the country-code top-level domain assigned to Libya.

Categories
Googleplex

Google imagery and logos

Early days

Early Google doodle

Burning Man Google logo

One of the earlier Google logos was motivated by the annual Burning Man festival. This logo was attributed to Google founders Larry Page and Sergey Brin.

According to Google,

…the revised logo was intended as a comical message to Google users that the founders were “out of office.” While the first doodle was relatively simple, the idea of decorating the company logo to celebrate notable events was well received by users.

Spring

Spring 2008 Doodle

Doodles

These modified versions of the Google logo were called Google doodles.

Google doodles that appeared in the years that followed included a celebration of the arrival of Spring and a Mars Rover landing commemorative image, with two rather cute little blue-ish green aliens sitting on the second “g” of Google.

Mars Rover

Google Doodle commemorates the Mars Rover landing

How many doodles has Google done over the years?

The doodle team has created over 300 doodles for Google.com in the United States and over 700 have appeared on international domains.

Braille commemorative doodle

Google in Braille

Where can I see all the Google doodles that have been done over the years?

All doodles can be found on the Google logos page.

How can I take part?

I love Google and have designed my own doodle. Where can I send my fan logo?  How can Google users and the public tell Google about ideas for future doodles?

You can send your fan logos and requests for doodles to [email protected].

What’s your process for selecting doodles?

Generally, we look for non-denominational doodles that are fun and quirky from a variety of categories, such as those that celebrate the lives of artists and inventors.

Gropius Google doodle

Remembering architect Walter Gropius

A rather special, interactive Google doodle was introduced this year, in honor of the 183rd birthday of Jules Verne. It ran for one day last month. The Trend Hunter provided very thorough coverage, for those who are curious.

Doodle 4 Google

Doodle4Google is a yearly contest held for children up to 18 years of age. The deadline for this year’s contest submissions was March 1, 2011. Winners have not yet been announced. More information about the contest requirements and past winners is available on the Doodle4Google history page.

Categories
Googleplex

Google Desktop supports 64 bit Windows

I just unearthed yet another Google product that has been in existence since 2009, or longer, that I had never heard of before today!

Google Desktop

Google Desktop search sidebar

Google Desktop Search

As I was browsing around the National Public Radio (“NPR”) online site, looking for Windows 7 supported Google Gadgets this morning, I found a reference to a Google Gadget for 64-bit Windows. This led me to a blog post dated over a year earlier. Google was far ahead of me.

Google Desktop allows one to use Google search on one’s own desktop, rather than being restricted to web browser search only. Apparently, Google Desktop support for 64-bit Windows was available as of July 2009!

I do not know if it applies to the Windows XP, Vista and Windows 7 operating systems, as the blog post didn’t specify. If not, Windows 7 support is probably available by now.

It was no surprise to find that the official source for information about Google Desktop is the Inside Google Desktop blog on Blogger. Google is very consistent with its product coverage strategy and data governance policies!

UPDATE

I confirmed today that Google Desktop does support 64-bit Windows XP and Windows 7.

Google Desktop has expanded more than I realized. Google Desktop search is available for users of Linux and Mac OS X too.

Categories
Googleplex

Blogger now supports dynamic views

Blogger became dynamic! Sort of. Blogger introduced five new views for Blogger  *.blogspot blogs yesterday. I was pleased to see the Tumblresque mosaic style was one of them.

Blogger sign in page

There are two separate versions of the documentation for this new functionality, one for blog readers and one for blog authors.

How to use dynamic views

The basic idea is to access the dynamic view through an extended URL. For example, I have a Blogger blog with URL http://ellieaskswhy.blogspot.com.

To view my blog so that I can choose any of the five dynamic views, visit http://ellieaskswhy.blogspot.com/view

If you already know which view you want and wish to see it directly, e.g. the mosaic view, visit this URL instead: http://ellieaskswhy.blogspot.com/view/mosaic
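The URL pattern above is simple enough to generate in code. Here is a minimal Python sketch; the helper name `dynamic_view_url` is my own invention, and it only builds the address. It does not check that the blog exists or that the view name is one Blogger supports.

```python
def dynamic_view_url(blog_url: str, view: str = "") -> str:
    """Build the dynamic-view URL for a Blogger blog.

    With no view name, returns the chooser page (".../view");
    with a view name such as "mosaic", returns that view directly.
    """
    base = blog_url.rstrip("/") + "/view"
    return f"{base}/{view}" if view else base


print(dynamic_view_url("http://ellieaskswhy.blogspot.com"))
print(dynamic_view_url("http://ellieaskswhy.blogspot.com", "mosaic"))
```

The two calls print the chooser URL and the mosaic URL shown earlier in this post.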

*Blogger is a Google product. Google acquired it in 2003, and is now making some much-needed upgrades like this. Dynamic views are supported in modern browsers only. They will not work without HTML5 support, which is included in the Chrome and Safari browsers, as well as Internet Explorer version 9.0 and the latest version of Firefox (probably 4.0).

Categories
Googleplex

reCAPTCHA definition and history

What does a CAPTCHA do?

Humans can read the distorted text in CAPTCHA challenges* but current computer programs cannot.

A CAPTCHA is a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot.

What does CAPTCHA mean?

CAPTCHA is an acronym for Completely Automated Public Turing Test To Tell Computers and Humans Apart. The term was coined in 2000 by the Carnegie Mellon University computer science researchers who originally invented CAPTCHA.

What is the difference between CAPTCHA and reCAPTCHA?

This is how the reCAPTCHA Project explains the difference:

ReCAPTCHA helps prevent automated abuse of your site (such as comment spam or bogus registrations) by using a CAPTCHA to ensure that only humans perform certain actions.

Generally a CAPTCHA is a single word, whereas a ReCAPTCHA is two words. The reCAPTCHA project page explains this in greater detail. There are research papers in PDF format available for download on the Google ReCAPTCHA website.

Google purchased reCAPTCHA in 2009 and describes usage and further background in the reCAPTCHA FAQs:

ReCAPTCHA is a free CAPTCHA service that helps to digitize books, newspapers and old-time radio shows.

ReCAPTCHA is free

While ReCAPTCHA is free to use, including the API, be aware that it is not open source software.

Other uses

ReCAPTCHA is best known for historic text digitization and spam filtering, which is an information security measure.

Answers to reCAPTCHA challenges are used to digitize textual documents… a combination of multiple OCR programs, probabilistic language models, and the answers from millions of humans on the internet, reCAPTCHA is able to achieve over 99.5% transcription accuracy at the word level….

OCR is an acronym. It means Optical Character Recognition. Compare the accuracy of standard OCR versus reCAPTCHA transcriptions of a medium quality scanned document on the reCAPTCHA digitization accuracy website. See some humorous reCAPTCHA examples from the official Google reCAPTCHA blog. Google announced an audio version of reCAPTCHA in 2009.
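The idea of combining OCR guesses with human answers can be sketched as a simple weighted vote. This is only an illustration under my own assumptions, not Google’s actual pipeline: the weights and the acceptance threshold below are illustrative values, the function name is hypothetical, and the probabilistic language models mentioned in the quote are left out entirely.

```python
from collections import defaultdict

# Illustrative values (assumptions): a human answer counts more than an OCR guess.
OCR_WEIGHT = 0.5
HUMAN_WEIGHT = 1.0
ACCEPT_THRESHOLD = 2.5


def transcribe_word(ocr_guesses, human_answers):
    """Return the agreed transcription of one scanned word, or None.

    Each candidate spelling collects votes; the top candidate is accepted
    only if its total reaches the threshold. Otherwise the word would be
    shown to more people.
    """
    votes = defaultdict(float)
    for guess in ocr_guesses:
        votes[guess.strip().lower()] += OCR_WEIGHT
    for answer in human_answers:
        votes[answer.strip().lower()] += HUMAN_WEIGHT
    if not votes:
        return None
    best, score = max(votes.items(), key=lambda item: item[1])
    return best if score >= ACCEPT_THRESHOLD else None


# One OCR program misreads "rn" as "m"; two humans agree with the other OCR program.
print(transcribe_word(["morning", "moming"], ["morning", "Morning"]))  # morning
print(transcribe_word(["moming"], ["morning"]))  # None
```

The second call returns None because a single human answer is not enough to settle the word on its own.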

MailHide is another application, where potential for spam is reduced by requiring a reCAPTCHA challenge in order to disclose an otherwise partially obscured email address. More details are available in my post about MailHide from last month.

Recent developments

Recent research in the area of computer security led to some surprising discoveries about CAPTCHA and spam. Initially, it appeared that the CAPTCHA challenge had been defeated on a large scale, though only in particular regions. That was not true, though. Human interaction of an unanticipated sort was still required to evade the CAPTCHA, on each and every spam comment and email that got through.

*Work continues on the original CAPTCHA project.

Categories
Googleplex

Chrome Developer Tutorial

This is an excellent tutorial for learning how to use the Developer Tools in the Google Chrome browser. The hyperlink goes to the official Google Groups site for the open source Chromium project, not to a third-party provider!

Check your webpage! Find errors! Reduce Page Load Times!

Learn how to use EVERYTHING: elements, resources, scripts, timelines, profiles, storage and the console. Once you learn how to use the developer tools, you own the keys to the kingdom.

There are some salient points that I selected four articles to explain more completely. Note that the article order does not reflect on Zemanta, as the choices were not based on priority of relevance to web development. They were idly selected in this order according to my own idiosyncratic whims.

Click to Play

Initially, I thought this might be a fun game. I was wrong. It refers to in-line advertising links and videos in general.

Sand boxing

This is important! Recall that the Google Chrome browser has Adobe’s Flash built in. Strictly speaking, Flash is a bundled plug-in rather than a Chrome Extension, although both add features to the browser. Chrome and Adobe offer a “sandbox” for Chrome’s Flash component. A sandbox is a circumscribed “safe” area where code can run, or a developer can do testing, without mishap, e.g. crashing the browser. This is also useful for non-developers who might want to “contain” their Flash usage due to a temporary concern about security.

Web Design

This provides more information about Chrome Extensions*, which relate to the sandboxes described above.

Browser Cache

This will reveal the location of the elusive browser cache on your computer. It is easy to find in other web browsers; the Options menu in Internet Explorer is the first example that comes to mind. The cache location is not nearly as obvious for Chrome. I need to read the article myself, in fact, as I have no idea where Chrome is storing my browser cache.

Related Articles: Selected by me from the Zemanta suggestions:

* There are many Chrome Extensions available. Each adds to the memory usage by the browser. If you load up too many, you can definitely hamper Chrome’s delightful responsiveness. That requires a certain effort.

Extensions will be a topic for a separate post.

Categories
Googleplex

Quality-of-Life in the Chrome O/S Cloud

Google Web Toolkit (“GWT”) is a productivity tool for developers. It is a

development toolkit for building and optimizing complex browser-based applications. GWT is used by many products at Google, including Google AdWords and Orkut. It’s open source, completely free, and used by thousands of developers [worldwide].

What programming language would be the most accessible for Google Chrome O/S apps development?

These are the existing constraints:

  1. Android apps are coded in Java.
  2. Chrome browser apps are JavaScript.
  3. A Java programmer can use a web toolkit to “translate” Java into JavaScript.

However, it will be more difficult to go in the other direction. That is, a PHP programmer can create JavaScript apps for Chrome browser. But Android apps require knowledge of Java. This is the reverse of item 3 (above), and is much more challenging.
Perhaps there is a unified language for both scripting as well as programming the core functionality of the app?

GWT Logo

Google Web Toolkit does that!

GWT certainly lets you write Java apps, then compile them into JavaScript. And it might get even better!

How? With a consolidated toolkit, based on GWT. Such a consolidated toolkit could be used to write an Android app that also works on Chrome O/S as a web app, without the need for coding in Java, only in JavaScript