Two Catalysts for Qualitative Change
Richard Snodgrass
October 1, 1999
Location
- City and State, 2000 BCE
- Longitude, 1773 CE
- GPS + cell phone, 1999 CE
Confluences
- Underlying technologies
- Highly accurate atomic clocks
- Geosynchronous satellites
- Advances in micro-circuitry
- Proliferation of cell phones
- Demonstrated need
- Catalyst: companies able to produce in quantity at low price
- Qualitative change
The Vision
The ACM Computing Portal
A web-based repository of bibliographic information
- Contains information on all papers and books in the computing literature
- Contains a pointer to the digitized version, if available
Objectives
- Qualitatively increase the effectiveness of scientific research into computing
- Continue to position ACM as the premier scientific and educational organization for computing
- Increase service of ACM and the SIGs to the scientific community
- Provide a concrete illustration of the scope of computer science
Presentation
Components
- Bibliographic Entries
- Abstracts and Keywords
- Full Text
- Citation Linking
- Demonstration
- Realizing the Computing Portal
- Revisit the components
- The Next Step
Step 1: Bibliographic Entries
- Collect all bibliographic entries from all computer science journals, conferences, workshops, technical bulletins, and books.
- Over the period from 1940 to 2000
- Approximately 1M entries
- Provide free searching on the web.
- Provide citations in multiple formats: HTML, BibTeX, refer, Word, ...
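As a rough illustration of serving one entry in several citation formats, the sketch below renders a single record as BibTeX, refer, and HTML. The `Entry` class, its field names, and the sample record are hypothetical placeholders introduced here, not part of the proposal; a real entry would carry many more fields.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Entry:
    """One bibliographic entry; the field names are illustrative assumptions."""
    key: str
    authors: list[str]
    title: str
    venue: str
    year: int


def to_bibtex(e: Entry) -> str:
    """Render the entry as a BibTeX @article record."""
    fields = {
        "author": " and ".join(e.authors),
        "title": e.title,
        "journal": e.venue,
        "year": str(e.year),
    }
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@article{{{e.key},\n{body}\n}}"


def to_refer(e: Entry) -> str:
    """Render the entry in refer format: one %A line per author, then %T, %J, %D."""
    lines = [f"%A {author}" for author in e.authors]
    lines += [f"%T {e.title}", f"%J {e.venue}", f"%D {e.year}"]
    return "\n".join(lines)


def to_html(e: Entry) -> str:
    """Render the entry as an HTML list item, escaping special characters."""
    authors = escape(", ".join(e.authors))
    return f"<li>{authors}. <cite>{escape(e.title)}</cite>. {escape(e.venue)}, {e.year}.</li>"


# Placeholder record, not a real publication.
sample = Entry("sample99", ["A. Author", "B. Writer"], "A Sample Paper", "Sample Journal", 1999)

if __name__ == "__main__":
    for render in (to_bibtex, to_refer, to_html):
        print(render(sample), end="\n\n")
```

Keeping one internal record and generating each format on demand means a new output format only needs one more rendering function.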
Step 2: Abstracts and Keywords
- Collect keywords, and later, abstracts, for all entries.
- Copyright restrictions on some abstracts?
Step 3: Full Text and Images
Collect full text of each available paper and book for
- searching
- developing classification maps and lexicons
- other analyses
Step 4: Citation Linking
- Start with full text of paper's bibliography.
- Out linking: identify bibliographic entry of papers referenced by the paper
- In linking: identify bibliographic entries of papers referencing the paper
- Use for citation analysis, knowledge diffusion studies
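The out-linking and in-linking steps above can be sketched as a lookup followed by an inversion. This is a minimal sketch that assumes cited references can be matched to entries by normalized title alone; real bibliographies would need fuzzier matching on authors, venue, and year. The function names, identifiers, and titles are hypothetical.

```python
from collections import defaultdict


def normalize(title: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(kept.split())


def link_citations(entries, bibliographies):
    """Build out-links and in-links.

    entries:         dict mapping entry id -> title
    bibliographies:  dict mapping citing entry id -> list of cited title strings
    Returns (out_links, in_links), each a dict mapping entry id -> list of entry ids.
    """
    by_title = {normalize(title): eid for eid, title in entries.items()}
    out_links = defaultdict(list)
    in_links = defaultdict(list)
    for citing, cited_titles in bibliographies.items():
        for raw in cited_titles:
            cited = by_title.get(normalize(raw))
            # Skip references with no matching entry, self-citations, and duplicates.
            if cited is None or cited == citing or cited in out_links[citing]:
                continue
            out_links[citing].append(cited)   # out link: citing -> cited
            in_links[cited].append(citing)    # in link: cited <- citing
    return dict(out_links), dict(in_links)


# Placeholder entries and bibliographies, not real publications.
entries = {
    "p1": "Wavelet Basics",
    "p2": "Indexing with Wavelets",
    "p3": "A Survey of Wavelet Methods",
}
bibliographies = {
    "p2": ["Wavelet basics."],
    "p3": ["WAVELET BASICS", "Indexing with wavelets", "An Unknown Report"],
}

out_links, in_links = link_citations(entries, bibliographies)
```

Here `in_links` is simply the inverse of `out_links`, so citation counts for citation analysis fall out as `len(in_links[eid])`.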
Demonstration
Papers with wavelet:
Stage 1: Bibliographic Entries
Propose that each SIG be responsible for collecting relevant entries.
- ensure completeness, based on SIG interests
- reduce overlap between SIGs
- ensure correctness
Software for data entry, validation, and conversion provided to SIGs
1M entries / 36 SIGs = roughly 30K entries per SIG
- e.g., SIGMOD: approximately 50K entries
Many resources
- DBLP: 130K entries
- Propose that ACM donate the ACM Guide to Computing Literature: 200K entries
- Collection of Computer Science Bibliographies: 930K entries
Stage 2: Keywords and Abstracts
- Propose that SIGs collect these.
- May need copyright permission, negotiated by ACM HQ
- Collection of CS bibliographies has 100K abstracts
Stage 3: Full Text
Propose that SIGs fund populating the full ACM Digital Library.
- PDF files containing encapsulated TIFF and OCRed full text
- 99% accuracy
- $1.25 per page.
Could go to SGML or XML, 99.9% accuracy: $8-$10 per page.
Populating the ACM DL
- Journals: 130K pages: $200K
- Conference and workshop proceedings: 500K pages: $600K
- Newsletters: 200K pages: $250K
- Total: 850K pages at $1050K
- $30K per SIG
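The budget arithmetic can be checked with a short calculation. This sketch takes the page counts, dollar figures, the $1.25 per-page rate, and the count of 36 SIGs directly from the slides; the variable names are introduced here for illustration.

```python
RATE_PER_PAGE = 1.25  # dollars per page for PDF with OCRed text, from the slides
NUM_SIGS = 36         # SIG count used earlier in the deck

# Page counts and quoted costs as listed on the slide.
pages = {"journals": 130_000, "proceedings": 500_000, "newsletters": 200_000}
quoted_cost = {"journals": 200_000, "proceedings": 600_000, "newsletters": 250_000}

# Cost of each category at the flat per-page rate.
computed_cost = {name: count * RATE_PER_PAGE for name, count in pages.items()}

total_pages = sum(pages.values())
total_quoted = sum(quoted_cost.values())
per_sig = total_quoted / NUM_SIGS

if __name__ == "__main__":
    for name in pages:
        print(f"{name}: computed ${computed_cost[name]:,.0f}, quoted ${quoted_cost[name]:,}")
    print(f"total pages listed: {total_pages:,}")
    print(f"total quoted: ${total_quoted:,}; per SIG: ${per_sig:,.0f}")
```

At a flat $1.25 per page the three categories come to about $162K, $625K, and $250K, so the quoted per-category figures appear to be rounded estimates. The listed page counts sum to 830K, slightly under the 850K total, which is presumably also rounded. Dividing $1050K across 36 SIGs gives about $29K, consistent with the $30K per SIG quoted.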
Stage 3: Full Text, cont.
ACM papers: 850K pages, or about 50K papers
- This represents 5% of the total of 1M papers.
ACM books: obtain full text from publishers.
For remaining conference proceedings,
- Offer full CD-ROM package at cost in exchange for inclusion in CD-ROM and use of full text for searching.
- Pay for digitization out of conference profits
- e.g., IEEE ICDE: 600 pages x 17 years x $1.25 = $13K.
- SIGs pay for integration: $0.25 - $0.50 per page.
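The ICDE example generalizes to any proceedings series. The sketch below reproduces the slide's estimate and adds the integration range, using only the rates quoted above; the function name and parameters are hypothetical.

```python
def series_cost(pages_per_year: int, years: int, rate_per_page: float) -> float:
    """Cost of processing a proceedings series at a flat per-page rate."""
    return pages_per_year * years * rate_per_page


# IEEE ICDE figures from the slide: 600 pages per year over 17 years.
icde_digitization = series_cost(600, 17, 1.25)       # paid out of conference profits
icde_integration_low = series_cost(600, 17, 0.25)    # paid by the SIG, low end
icde_integration_high = series_cost(600, 17, 0.50)   # paid by the SIG, high end

if __name__ == "__main__":
    print(f"digitization: ${icde_digitization:,.0f}")
    print(f"integration: ${icde_integration_low:,.0f} to ${icde_integration_high:,.0f}")
```

Digitization comes to $12,750, which the slide rounds to $13K; integration for the same 10,200 pages would add between $2,550 and $5,100.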
Stage 3: Journal Papers
For other journals,
- Same offer as with conferences
- Or, offer URL into their DL in exchange for full text, only for searching
- ACM Computing Portal provides valuable entry into their DL, enhancing their revenue stream.
For other books, make same offer.
Open Architecture
- Free searching via web interface, including full text search, at ACM site and SIG portals
- Bibliographic data available for other search engines
- As much PDF available for free as possible
- Encourage digitization of corpus
Summary
The ACM Computing Portal
- Free searchable access to the entire computer science corpus
- SIG-specific portals
- Fully populated ACM DL
- Inclusion of or portal to other DL resources
- Capability to purchase papers and to register queries
- Possibly ancillary SIG-provided benefits, such as CD-ROMs
SGB Portal Committee
- Rick Snodgrass (University of Arizona, CS), chair
- Steve Cunningham (Cal State University-Stanislaus, CS)
- Mary Fernandez (AT&T Labs)
- Carol Hutchins (Courant Institute of Math. Sci. Library)
- Bob Krovetz (NEC Research Institute)
- Michael Ley (University of Trier, CS)
- Andreas Paepcke (Stanford University)
- Kathy Preas (KP Pubs on CDROM)
- Charles Viles (Univ. of North Carolina, Info and Lib Sci)
Individual SIG Commitments
- Collect and capture SIG-relevant bibliographic entries, abstracts, and keywords, in appropriate format.
- Allocate funds to populate the ACM DL: journals, conference and workshop proceedings, SIG newsletter.
- Roughly $30K for each SIG
- SIGDA matching funds: $50K
- Negotiate with steering committees of associated conferences and workshops.
ACM HQ Commitments
- Donate entries from ACM Guide to Computing Literature.
- Negotiate cross-use agreements with associated societies.
- Acquire full text of books copyrighted by ACM.
- Provide hardware and software to host CSP.
- Provide staff to manage CSP, with content provided by SIGs.
ACM HQ Opportunities
- Integrate CSP with CoRR
- Provide print and CD-ROM versions of the expanded ACM Guide to Computing Literature
- Fully populated DL
- Increased visibility of ACM
Confluences
Underlying technologies
- Inexpensive scanning, OCR, and disk space; inexpensive, high-capacity CD-ROMs
Demonstrated need
Catalysts: ACM Council and SIG Governing Board
Qualitative change