1
MyLifeBits:
Attempting to realize the Memex Vision
Gordon Bell
February 2003
http://research.microsoft.com/barc/MediaPresence/MyLifeBits.aspx
With Jim Gemmell & Roger Lueder
2
Outline … MyLifeBits
Background…fulfilling the Memex vision
Cyberizing everything
File to database transition
Use…beyond search
3
Memex
Posited by Vannevar Bush in “As We May Think,” The Atlantic Monthly, July 1945
“A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility”
Supports: Annotations, links between documents, and “trails” through the documents
“yet if the user inserted 5000 pages of material a day it would take him hundreds of years to fill the repository, so that he can be profligate and enter material freely”
4
6
7
Memory Overload
As hard drives get bigger and cheaper, we're storing way too much.
By Jim Lewis
There's a famous allegory about a map of the world that grows in detail until every point in reality has its counterpoint on paper; the twist being that such a map is at once ideally accurate and entirely useless, since it's the same size as
8
“The PC is going to be the place where you store the information and really the center of control” Billg 1/7/2001
MyLifeBits is a project to “cyberize” everything!
What? Recall of all articles, books, CDs, photos, video, communication (e.g. mail, phone), web
Why? … “because we can”
Office: communicate, store, & work
Home & Media Center: ambiance & entertainment
Immortality for progeny. Memory aids
Goal: to understand the 1 TByte PC c2006:
9
Gordon: Researcher, consumer, computer system tester, nerd wanna-be, and average man
Melissa: middle manager
Patrick: Consultant
Nicholas: Analyst
Sondra: Office manager
10
The guinea pig
Gordon Bell is digitizing his life
Has now scanned virtually all:
Books written (and read when possible)
Personal documents (correspondence including memos and email, bills, legal documents, papers written, …)
Photos
Posters, paintings, photos of things (artifacts, …medals, plaques)
Home movies and videos
CD collection
And, of course, all PC files
Now recording: phone, radio, TV (movies), web pages… conversations?
11
12
13
Input: tools, time, and cost
Scanners: HP Digital Sender, flat beds with ADF, 2-HP photo, faxing. (Duplex, color, feed-thru, etc.)
A good commercial scanner costs 2K-10K
Photos: $1 or 0.5-5 min.
Large posters: ~ 1-5 hr.
Artifacts: ~ 10 min. including photo
Scanning to TIF, PDF: <1 min/page or .10/page
OCR: for MODI or PDF: ~3-5 pages/min (old data)
OCR: to recreate an editable “original” 10 min/page!
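The per-page figures above can be turned into a rough effort estimate for a whole archive. The sketch below is illustrative only: the function names and the 6000-page example are assumptions, the .10/page figure is presumably dollars, and the rates are the slide's upper-bound or slower values.

```python
def scanning_estimate(pages, minutes_per_page=1.0, cost_per_page=0.10):
    """Return (hours, cost) to scan `pages` pages.

    Defaults are the slide's figures: under 1 min/page when scanning
    yourself, or .10/page when paying per page (currency assumed).
    """
    hours = pages * minutes_per_page / 60.0
    cost = pages * cost_per_page
    return hours, cost


def ocr_hours(pages, pages_per_minute=3.0):
    """Hours of OCR for MODI or PDF at the slower quoted rate (~3 pages/min)."""
    return pages / pages_per_minute / 60.0


def editable_ocr_hours(pages, minutes_per_page=10.0):
    """Hours to recreate an editable "original" at 10 min/page."""
    return pages * minutes_per_page / 60.0


# Hypothetical archive size, chosen only to make the numbers concrete.
pages = 6000
hours, cost = scanning_estimate(pages)
print(f"scan: up to {hours:.0f} h, or about {cost:.0f} if paid per page")
print(f"OCR for search: about {ocr_hours(pages):.0f} h")
print(f"OCR to editable text: about {editable_ocr_hours(pages):.0f} h")
```

The gap between the last two lines is the point of the slide: OCR that only supports search is cheap, while recreating an editable document costs far more than the scan itself.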
14
CyberAll Nov. 1, 2001
17.7 GB (by size); 27.1K files & 42K .msg
Music: 6.9 GB, 1.8K files, 180 CDs
Working: 2.3 GB, 432 folders, 2.9K files
Archive: 5.1 GB, 477 folders, 18.7K files
Video: 2.6 GB, 10 hours, low res
My Books: 98 MB
Files (by number): .xls, .jpg, .doc/html, .pdf, .ppt/ppt albums, .tif
16
17
MyLifeBits organization: time and space
Timeline / Context (space)
Personal (some $s)
GB Co. (angel, etc.)
Professional: ACM, etc., …
@Microsoft.com, New co’s.
18
MyLifeBits: Some Lives(t)
Personal
Parents, children, grandkids
CGB himself
Close friends
GB $s
Personal incl. several legal structures
Investments & boards
Past companies/organiz’ns
DEC
Carnegie-Mellon U.
DEC, NSF, Encore, Ardent, GB_consulting,
CGB @ Microsoft
MLB
Clusters
Telepresence
WWW presence
Computer History Museum
BOD member
Fund-raising
CyberMuseum
Startups
Bell-Mason Director
19
MyLifeBits is:
Memex and more (audio and video)
Universal store for all personal stuff
Guiding principles for the system:
1. Full text search & collections (> than hierarchy)
2. Visualizations for search, display, insight
3. Annotations and links add value and are essential
Increase search ability and value of information.
So make many kinds and make them easy to create!
Stories are the ultimate annotation
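A minimal sketch of the first and third principles, not the MyLifeBits implementation: items live in any number of collections (unlike a folder hierarchy), and annotations are fed into the same full-text index so they become extra search handles. The `Store` class and its method names are hypothetical.

```python
import re
from collections import defaultdict


class Store:
    """Toy personal store: collections, annotations, and a full-text index."""

    def __init__(self):
        self.text = {}                       # item id -> searchable text
        self.collections = defaultdict(set)  # collection name -> item ids
        self.index = defaultdict(set)        # word -> item ids

    def add(self, item_id, text, collections=()):
        self.text[item_id] = text
        for name in collections:             # an item may be in many collections
            self.collections[name].add(item_id)
        self._index(item_id, text)

    def annotate(self, item_id, note):
        # Annotations are indexed like content, so they aid later search.
        self.text[item_id] += "\n" + note
        self._index(item_id, note)

    def _index(self, item_id, text):
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.index[word].add(item_id)

    def search(self, query, collection=None):
        """Return ids of items containing every query word."""
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return set()
        hits = set.intersection(*(self.index.get(w, set()) for w in words))
        if collection is not None:
            hits &= self.collections.get(collection, set())
        return hits


store = Store()
store.add("photo1.jpg", "", collections=["Photos", "BARC"])
store.add("memo.doc", "Lunch budget memo", collections=["Working"])
store.annotate("photo1.jpg", "BARC dim sum intern farewell lunch")
print(store.search("lunch"))
print(store.search("lunch", collection="Photos"))
```

The photo has no text of its own; it is findable only because of its annotation, which is the sense in which annotations add value.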
20
MLB database: size and content?
Database features are essential: Consistency, Indexing, Pivoting, Queries, Speed/scalability, Backup, replication.
Folders & Files were the starting point >> database into sets aka “collections” that are identical to the folder structure
Outlook (msgs, attachments, calendar, contacts)
Web trails including voice message annotation
Journal (Outlook), trails: every document use & transaction
What about?
Money (transactions, payees, etc.)…is their lifelog/trail
Streets and trips to cross-index to all docs
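The move from folders and files to database “collections” can be sketched with an in-memory SQLite database. This is an illustration under stated assumptions, not the MLB schema: the table names, helper functions, and example paths are all hypothetical. Every folder on a file's path becomes a collection, so the initial collections mirror the folder structure, and further collections can then be added freely.

```python
import sqlite3
from pathlib import PurePosixPath


def add_to_collection(db, item_id, collection):
    """Put an item in a collection, creating the collection if needed."""
    db.execute("INSERT OR IGNORE INTO collections (name) VALUES (?)",
               (collection,))
    (cid,) = db.execute("SELECT id FROM collections WHERE name = ?",
                        (collection,)).fetchone()
    db.execute("INSERT OR IGNORE INTO membership VALUES (?, ?)",
               (item_id, cid))


def build_db(paths):
    """Load file paths; each enclosing folder becomes a collection."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, ext TEXT);
        CREATE TABLE collections (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE membership (item_id INTEGER, collection_id INTEGER,
                                 UNIQUE (item_id, collection_id));
    """)
    for p in map(PurePosixPath, paths):
        item_id = db.execute("INSERT INTO items (name, ext) VALUES (?, ?)",
                             (p.name, p.suffix.lower())).lastrowid
        for folder in p.parents:
            if folder.name:  # skip the nameless top-level entry
                add_to_collection(db, item_id, str(folder))
    return db


def items_in(db, collection):
    """Names of the items in a collection, in alphabetical order."""
    rows = db.execute("""
        SELECT items.name FROM items
        JOIN membership ON membership.item_id = items.id
        JOIN collections ON collections.id = membership.collection_id
        WHERE collections.name = ? ORDER BY items.name""", (collection,))
    return [name for (name,) in rows]


def count_by_extension(db):
    """A simple pivot: number of items per file extension."""
    return dict(db.execute("SELECT ext, COUNT(*) FROM items GROUP BY ext"))


db = build_db(["Archive/DEC/memo.doc", "Working/talk.ppt", "Archive/photo.jpg"])
add_to_collection(db, 1, "Memex talks")  # memo.doc, also still in Archive/DEC
add_to_collection(db, 2, "Memex talks")  # talk.ppt, also still in Working
print(items_in(db, "Archive"))
print(items_in(db, "Memex talks"))
print(count_by_extension(db))
```

The many-to-many `membership` table is what a pure file hierarchy lacks: the same document sits in its original folder-derived collection and in any number of later ones.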
22
23
[Diagram labels: CD, VCR, Cassette, Plasma Panel, DVD, Media Center Computer, Set top (x2), Kbd, Mse, Wfr, Spkr (x2), IR, Cable/Satellite, Ethernet; links: SVHS-wide, 5.1 digital, 5 speakers, stereo, comp., Video*]
Cables/links:
Speaker: 5+1
Plasma: 2 or 3
Cable/Enet: 2
IR: 8
Stereo: 4
5.1 digital: 2
Comp./S-video: 3
Plasma panel: 1
Power: 10
Kbd/mse: 2
Monitor II (opt.): 4
Camera: 2
Total: 42 – 46
Things: 18+remotes
*Video = composite or S-video
24
25
Caneel Bay Vacation Jan. 1998
Gordon, Gwen, Brig, Pam, Fiona, Bob, Laura and Kolbe
26
Searching: the most useful app?
Challenge: What questions for useful results?
Lots of ways to look at what you retrieve
Need for breaking the returns into segments
Searching for an indexer and search engine: index service, Enfish, dtSearch
Stuff I’ve Seen: MSR’s index & search… evolving in the right direction.
Productizing would remove the pressure for Longhorn
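One way to read “breaking the returns into segments” is grouping a flat hit list by some attribute such as type or year, so the user can scan a handful of labeled groups instead of one long list. A minimal sketch, with hypothetical hit records and a hypothetical `segment` helper:

```python
from collections import defaultdict
from datetime import date


def segment(results, key):
    """Group search hits into labeled segments; `key` maps a hit to its label."""
    groups = defaultdict(list)
    for hit in results:
        groups[key(hit)].append(hit)
    return dict(groups)


# Made-up hits for illustration only.
hits = [
    {"name": "farewell.jpg", "type": "photo", "date": date(2000, 7, 7)},
    {"name": "caneel.jpg", "type": "photo", "date": date(1998, 1, 15)},
    {"name": "lunch-memo.doc", "type": "doc", "date": date(2000, 7, 6)},
]

by_type = segment(hits, key=lambda h: h["type"])
by_year = segment(hits, key=lambda h: h["date"].year)
for label, group in by_year.items():
    print(label, [h["name"] for h in group])
```

Because the grouping key is just a function, the same hits can be re-segmented in several ways, which matches the slide's “lots of ways to look at what you retrieve.”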
30
31
Resource explorer
32
34
35
36
Visualization
Browsing & searching. “Get me what I want|need!”
Help the user find things among possible items versus waiting for an ideal system that can find “what I want”
Publication: Conventional & web, presentations, etc.
Helps understand the nature of the content e.g. histogram of objects in time
Context: Links to help understand the relationship between objects. Provides more search handles.
Information density: what is it?
What is its relationship to others?
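The “histogram of objects in time” mentioned above can be sketched in a few lines. This is a plain-text illustration, not the project's visualization; the function names and sample dates are assumptions.

```python
from collections import Counter
from datetime import date


def histogram_by_year(dates):
    """Count objects per calendar year, in chronological order."""
    counts = Counter(d.year for d in dates)
    return sorted(counts.items())


def render(hist, width=40):
    """Render (year, count) pairs as text bars scaled to `width` characters."""
    if not hist:
        return ""
    peak = max(count for _, count in hist)
    return "\n".join(
        f"{year} {'#' * max(1, round(count * width / peak))} {count}"
        for year, count in hist
    )


# Made-up object dates for illustration only.
dates = [date(1998, 1, 5), date(2000, 7, 7), date(2000, 8, 1), date(2001, 11, 1)]
print(render(histogram_by_year(dates), width=20))
```

Even this crude view shows where a collection is dense or empty in time, which is a cue for browsing before any query is typed.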
37
Value of media depends on annotations
38
System annotations provide base level of value
39
Tracking usage – even better
Date 7/7/2000. Opened 30 times, emailed to 10 people (it’s valued by the user!)
40
Getting the user to say a little something is a big jump
Date 7/7/2000. Opened 30 times, emailed to 10 people. “BARC dim sum intern farewell Lunch”
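The progression on the last few slides (system annotation, then usage tracking, then a few words from the user) can be modeled as layers on one record. A minimal sketch; the `MediaItem` class, its fields, and the recipient names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MediaItem:
    name: str
    system: dict = field(default_factory=dict)     # set automatically, e.g. date
    opens: int = 0                                 # usage tracking
    emailed_to: set = field(default_factory=set)   # usage tracking
    caption: str = ""                              # what the user says about it

    def record_open(self):
        self.opens += 1

    def record_email(self, recipient):
        self.emailed_to.add(recipient)

    def summary(self):
        """Combine every annotation layer that is present into one line."""
        parts = []
        if "date" in self.system:
            parts.append(f"Date {self.system['date']}.")
        parts.append(f"Opened {self.opens} times, "
                     f"emailed to {len(self.emailed_to)} people.")
        if self.caption:
            parts.append(f'"{self.caption}"')
        return " ".join(parts)


photo = MediaItem("lunch.jpg", system={"date": "7/7/2000"})
for _ in range(30):
    photo.record_open()
for n in range(10):
    photo.record_email(f"person{n}")   # placeholder recipients
photo.caption = "BARC dim sum intern farewell Lunch"
print(photo.summary())
```

Only the caption requires effort from the user; the other two layers accumulate for free, which is why they form the base level of value.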
41
Getting the user to tell a story is the ultimate in media value
A story is a “layout” in time and space
Most valuable content (by selection, and by being well annotated)
Stories must include links to any media they use (for future navigation/search – “transclusion”).
Cf: MovieMaker; Creative Memories PhotoAlbums
Dapeng was an intern at BARC for the summer of 2000
We took him to lunch at our favorite Dim Sum place to say farewell
At table L-R: Dapeng, Gordon, Tom, Jim, Don, Vicky, Patrick, Jim
42
Value of media depends on annotations
Auto-annotate whenever possible e.g. GPS cameras
Make manual annotation as easy as possible. XP photo capture, voice, photos with voice, etc
Support gang annotation
Make stories easy
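Two of these points lend themselves to a small sketch: auto-annotation from data the system already has, and gang annotation, read here as applying one note to many items in a single step. File-system metadata stands in for GPS camera data because the standard library does not read camera metadata; both function names are hypothetical.

```python
import os
from datetime import datetime, timezone


def auto_annotate(path):
    """Annotations derived from the file system, at no cost to the user."""
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": os.path.basename(path),
        "bytes": info.st_size,
        "modified": modified.isoformat(),
    }


def gang_annotate(annotations, item_ids, note):
    """Attach the same note to every selected item in one step."""
    for item_id in item_ids:
        annotations.setdefault(item_id, []).append(note)
    return annotations


notes = gang_annotate({}, ["img01.jpg", "img02.jpg", "img03.jpg"],
                      "Caneel Bay Vacation Jan. 1998")
print(notes)
```

Annotating a whole batch of vacation photos with one caption costs the user the same effort as annotating a single photo.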
45
The Agenda for the Tbyte(s), Lifetime, PC:
The killer app after office and mail.
1. Guarantee that data will live forever! “dear appy” problem
2. Cheap, easy, and data-rich (e.g. time, place) capture:
GPS and time everywhere
Paper capture has to be as easy as discard (scanner/shredder)
E-book…e-magazines & journals need to have critical mass!
Telephony and audio capture with indexing
Media Center compatible for entertainment (photos, video, TV, radio)
3. One? dbase for all books, conversations, mail, web pages … vs. long-term use of hierarchical files. Is dbase intuitive?
4. Annotations/meta-information add ever-increasing value
Ease of annotation because it aids search and becomes the content
Content analysis (critical for photo & video!)
5. Information control: privacy, security, expunge/deniability,…
6. New “killer apps”: Alzheimer’s, immortality, surrogate memory?
46
47
The “dear appy” problem
Dear Appy,
How committed are you?
Please come back to me,
Lost and forgotten data
Who’s responsible?
media
platform, file, and databases
evolving standards and formats
48
Digitizing our lives
Right now, it is affordable to buy 100 GB/year
In 5 years 1 TB/year is affordable!
It’s hard to fill a terabyte/year just by keeping what you see or hear, but you can:
Look at 9800 pictures a day (300 KB JPEGs)
Read 2900 documents a day (1 MB files)
Listen to audio or view compressed video 24 hours/day (it takes more than 256 kb/s to fill a TB in a year)
Watch 1.5 Mb/s video 4 hours each day.
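The figures on this slide can be checked with a few lines of arithmetic. The sketch assumes binary units (1 TB = 2^40 bytes, 1 KB = 2^10 bytes) and a 365-day year, which the slide does not state; under those assumptions the picture and document counts come out close to the quoted 9800 and 2900.

```python
KB, MB, TB = 2**10, 2**20, 2**40   # binary units: an assumption
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365


def fill_rates(capacity_bytes=TB):
    """Daily consumption needed to fill `capacity_bytes` in one year."""
    bytes_per_day = capacity_bytes / DAYS_PER_YEAR
    bits_per_day = bytes_per_day * 8
    return {
        "pictures_per_day": bytes_per_day / (300 * KB),   # 300 KB JPEGs
        "documents_per_day": bytes_per_day / (1 * MB),    # 1 MB files
        # Constant bit rate that fills the capacity when recorded 24 hours/day.
        "continuous_bps": bits_per_day / SECONDS_PER_DAY,
        # Hours per day of 1.5 Mb/s video that fill the capacity.
        "video_hours_per_day": bits_per_day / 1.5e6 / 3600,
    }


for name, value in fill_rates().items():
    print(f"{name}: {value:,.1f}")
```

With these units the continuous rate comes out a little under 280 kb/s, consistent with “more than 256 kb/s,” and the 1.5 Mb/s video figure comes out near 4.5 hours a day; with a decimal terabyte (10^12 bytes) it is almost exactly the slide's 4 hours.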