Search concepts, not keywords, IBM tells business
IBM plans to give away key search technologies for corporate data retrieval that use concepts and facts instead of simpler "keyword" searches relied upon by consumer Web companies such as Google Inc., the world's largest computer company said on Monday.
While simple but powerful keyword searches have revolutionized how Internet users locate and retrieve information, IBM is looking to transform how office workers sift through the piles of data stored inside organizations.
"I don't see any of the major players moving into this area," Arthur Ciccolo, head of search technology at IBM Research, said of how major consumer Internet search companies such as Google, Yahoo Inc. and Microsoft have focused on the public Internet instead of private record data retrieval.
IBM plans to openly offer other software developers its Unstructured Information Management Architecture (UIMA), a technology that can analyze text within documents and other media to understand latent meanings, relationships and facts.
We Are the Web
The Netscape IPO wasn't really about dot-commerce. At its heart was a new cultural force based on mass collaboration. Blogs, Wikipedia, open source, peer-to-peer - behold the power of the people.
Ten years ago, Netscape's explosive IPO ignited huge piles of money. The brilliant flash revealed what had been invisible only a moment before: the World Wide Web. As Eric Schmidt (then at Sun, now at Google) noted, the day before the IPO, nothing about the Web; the day after, everything.
Computing pioneer Vannevar Bush outlined the Web's core idea - hyperlinked pages - in 1945, but the first person to try to build out the concept was a freethinker named Ted Nelson who envisioned his own scheme in 1965. However, he had little success connecting digital bits on a useful scale, and his efforts were known only to an isolated group of disciples. Few of the hackers writing code for the emerging Web in the 1990s knew about Nelson or his hyperlinked dream machine.
At the suggestion of a computer-savvy friend, I got in touch with Nelson in 1984, a decade before Netscape. We met in a dark dockside bar in Sausalito, California. He was renting a houseboat nearby and had the air of someone with time on his hands. Folded notes erupted from his pockets, and long strips of paper slipped from overstuffed notebooks. Wearing a ballpoint pen on a string around his neck, he told me - way too earnestly for a bar at 4 o'clock in the afternoon - about his scheme for organizing all the knowledge of humanity. Salvation lay in cutting up 3 x 5 cards, of which he had plenty.
Although Nelson was polite, charming, and smooth, I was too slow for his fast talk. But I got an aha! from his marvelous notion of hypertext. He was certain that every document in the world should be a footnote to some other document, and computers could make the links between them visible and permanent. But that was just the beginning! Scribbling on index cards, he sketched out complicated notions of transferring authorship back to creators and tracking payments as readers hopped along networks of documents, what he called the docuverse. He spoke of "transclusion" and "intertwingularity" as he described the grand utopian benefits of his embedded structure. It was going to save the world from stupidity.
I believed him. Despite his quirks, it was clear to me that a hyperlinked world was inevitable - someday. But looking back now, after 10 years of living online, what surprises me about the genesis of the Web is how much was missing from Vannevar Bush's vision, Nelson's docuverse, and my own expectations. We all missed the big story. The revolution launched by Netscape's IPO was only marginally about hypertext and human knowledge. At its heart was a new kind of participation that has since developed into an emerging culture based on sharing. And the ways of participating unleashed by hyperlinks are creating a new type of thinking - part human and part machine - found nowhere else on the planet or in history.
Not only did we fail to imagine what the Web would become, we still don't see it today! We are blind to the miracle it has blossomed into. And as a result of ignoring what the Web really is, we are likely to miss what it will grow into over the next 10 years. Any hope of discerning the state of the Web in 2015 requires that we own up to how wrong we were 10 years ago.
The Dream of a Lifetime
You've likely heard stories about the birth of the PC: of Xerox PARC as the Mecca of computing; of its creation of the Alto, Ethernet, and the laser printer; of the Homebrew Computer Club, the MITS Altair, Bill Gates and the theft of his Micro-soft Basic; of Steve Jobs and Stephen Wozniak, the founding of Apple, and the Jobs visit to PARC that inspired the Macintosh.
But what you may not know about is the really early history. The stories of Doug Engelbart and John McCarthy, of the Augmentation Research Center, and of the early days of the Stanford University AI Lab (SAIL) are not well known. Yes, you may have heard that Engelbart invented the mouse, and that SAIL and Stanford led to companies like Sun and Cisco. But there are better stories, great and old ones from the early days of computing, about the events that led to personal computing as we know it.
In his wonderful new book, What the Dormouse Said..., John Markoff tells these stories.
IBM expands corporate search ambitions
IBM's mission to spice up corporate search and become a "Google for the enterprise" continues in earnest.
By the end of the year, Big Blue intends to release an update to its corporate information-management tools, which are designed to bring order to potentially thousands of data sources in a company's network.
Code-named Serrano, the product will use technologies including artificial intelligence and data mining to derive more meaning from corporate documents. It will also have a revamped search engine and front-end tool designed to make hunting for company information as straightforward as searching the Web, according to IBM.
At I.B.M., That Google Thing Is So Yesterday
Suddenly, the computer world is interesting again. The last three months of 2004 brought more innovation, faster, than users have seen in years. The recent flow of products and services differs from those of previous hotly competitive eras in two ways. The most attractive offerings are free, and they are concentrated in the newly sexy field of "search."
Google, current heavyweight among systems for searching the Internet, has not let up from its pattern of introducing features and products every few weeks. Apart from its celebrated plan to index the contents of several university libraries, Google has recently released "beta" (trial) versions of Google Scholar, which returns abstracts of academic papers and shows how often they are cited by other scholars, and Google Suggest, a weirdly intriguing feature that tries to guess the object of your search after you have typed only a letter or two. Give it "po" and it will show shortcuts to poetry, Pokémon, post office, and other popular searches. (If you stop after "p" it will suggest "Paris Hilton.") In practice, this is more useful than it sounds.
Microsoft, heavyweight of the rest of computerdom, has scrambled to catch up with search innovations from Google and others. On Dec. 10, a company official made a shocking disclosure. For years Microsoft had emphasized the importance of "WinFS," a fundamentally new file system that would make it much easier for users to search and manage information on their own computers. Last summer, the company said that WinFS would not be ready in time for inclusion with its next version of Windows, called Longhorn. The latest news was that WinFS would not be ready even for the release after that, which pushed its likely delivery at least five years into the future. This seemed to put Microsoft entirely out of the running in desktop search. But within three days, it had released a beta version of its new desktop search utility, which it had previously said would not be available for months.
Meanwhile, a flurry of mergers, announcements and deals from smaller players produced a dazzling variety of new search possibilities. Early this month Yahoo said it would use the excellent indexing program X1 as the basis for its own desktop search system, which it would distribute free to its users. The search company Autonomy, which has specialized in indexing corporate data, also got into the new competition, as did Ask Jeeves, EarthLink, and smaller companies like dtSearch, Copernic, Accoona and many others.
I have most of these systems running all at once on my computer, and if they don't melt it down or blow it up I will report later on how each works. But today's subject is the virtually unpublicized search strategy of another industry heavyweight: I.B.M.
In Google We Trust?
From Jon Udell's blog:
Dave Winer today points to Scott Rosenberg's excellent take on Google's new library venture. Scott concludes:
The public has a big interest in making sure that no one business has a chokehold on the flow of human knowledge. As long as Google's amazing project puts more knowledge in more hands and heads, who could object? But in this area, taking the long view is not just smart -- it's ethically essential. So as details of Google's project emerge, it will be important not just to rely on Google's assurances but to keep an eye out for public guarantees of access, freedom of expression and limits to censorship. Scott Rosenberg
Google Is Adding Major Libraries to Its Database
Google, the operator of the world's most popular Internet search service, plans to announce an agreement today with some of the nation's leading research libraries and Oxford University to begin converting their holdings into digital files that would be freely searchable over the Web.
It may be only a step on a long road toward the long-predicted global virtual library. But the collaboration of Google and research institutions that also include Harvard, the University of Michigan, Stanford and the New York Public Library is a major stride in an ambitious Internet effort by various parties. The goal is to expand the Web beyond its current valuable, if eclectic, body of material and create a digital card catalog and searchable library for the world's books, scholarly papers and special collections.
Google Plans New Service for Scientists and Scholars
Google Inc. plans to announce on Thursday that it is adding a new search service aimed at scientists and academic researchers.
Google Scholar, which was scheduled to go online Wednesday evening at scholar.google.com, is a result of the company's collaboration with a number of scientific and academic publishers and is intended as a first stop for researchers looking for scholarly literature like peer-reviewed papers, books, abstracts and technical reports.
Even Digital Memories Can Fade
The nation's 115 million home computers are brimming over with personal treasures - millions of photographs, music of every genre, college papers, the great American novel and, of course, mountains of e-mail messages.
Yet no one has figured out how to preserve these electronic materials for the next decade, much less for the ages. Like junk e-mail, the problem of digital archiving, which seems straightforward, confounds even the experts.
Summarizer gets the idea
The flow of a document, including the topics covered and the ways those topics relate to each other, is clear to people. It would be useful if computer systems that process documents -- like search engines and programs that generate summaries of news articles -- could also learn to consider topic information.
Teaching a computer to discern a document's topics and create a summary that puts the topics in the correct order is a bit like teaching it how to put together the pieces of a jigsaw puzzle. Current methods focus on finding the right match for a given piece.
Researchers from the Massachusetts Institute of Technology and Cornell University have developed a system that does the equivalent of putting pieces that show parts of a mountain and pieces that show parts of the sky into separate groups, and putting the sky pieces above the mountain pieces, said Lillian Lee, an associate professor of computer science at Cornell University.
Faceted Classification
Faceted classifications are increasingly common on the World Wide Web, especially on commercial web sites (Adkisson 2003). This is not surprising--facets are a natural way of organizing things. Many web designers have probably rediscovered them independently by asking, "What other ways would people want to view this data? What's another way to slice it?" A survey of the literature on applying facets on the web (Denton 2003) shows that librarians think it a good idea but are unsure how to do it, while the web people who are already doing it are often unaware of S.R. Ranganathan, the Classification Research Group, and the decades of history behind facets.
"Aristotle" (The Knowledge Web)
(DANNY HILLIS:) I have always envied Alexander the Great, because he had Aristotle as a personal tutor. In those days, Aristotle knew pretty much everything there was to know. Even better, Aristotle understood the mind of Alexander. He understood which topics interested Alexander, what Alexander knew and did not know, and what kinds of explanations Alexander preferred. Aristotle had been a student of Plato, and he was himself a great teacher. We know from his writings that he was full of examples, explanations, arguments, and stories. Through Aristotle, Alexander had the knowledge of the world at his command.
Of course no one today knows all that is known, in the sense that Aristotle did. Now there is far too much knowledge for that to be possible. The scientific revolution, and the technological revolution that followed it, led to a self-reinforcing explosion of knowledge. The explosion continues. Today not even the most highly trained scientist, the most scholarly historian, or the most competent engineer can hope to have more than a general overview of what is known. Only specialists understand most of the new discoveries in science, and even the specialists have trouble keeping up.
This problem isn't new. In 1945, Vannevar Bush wrote an essay for Atlantic Monthly about the problem of too much knowledge. He wrote,
Scirus
Here there be data: Mapping the landscape of science
In ancient maps of the world, expanses of unknown territory might hold a warning to would-be explorers: Here there be monsters. For today's explorers seeking to navigate and understand the world of science, the monsters are the untamed collections of data that inhabit a largely uncharted landscape.
The April 6, 2004, issue of the Proceedings of the National Academy of Sciences (PNAS) features nearly 20 articles by some of tomorrow's mapmakers. Representing the computer, information and cognitive sciences, mathematics, geography, psychology and other fields, these researchers present attempts to create maps of science from the ever-growing and constantly evolving ocean of digital data.
