Title: Bug in dbm - long strings in keys and values
Components: Library (Lib)
Status: closed
Resolution: wont fix
Nosy List: jdcrunchman, loewis, tinolange
Priority: high

Created on 2003-10-21 20:05 by jdcrunchman, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (4)
msg18706 - Author: John Draper (jdcrunchman) Date: 2003-10-21 20:05
#!/usr/local/bin/python
#2003-10-19. Feedback to crunch@shopip.com
import dbm

print """
Python dbm bugs summary:
  1. Long strings cause weirdness.
  2. Long keys fail without returning error.

This demonstrates serious bugs in the Python dbm 
module.
Present in Python versions 2.2, 2.3, and 2.3.2c1 on OpenBSD.

len(key+string)>=61231 results in the item being 'lost',
without warning.
If the key or string is one character shorter, it is fine.
Writing multiple long strings causes unpredictable results
(none, some, or all of the items are lost without warning).

Curiously, keys of length 57148 return an error, but
longer keys fail without warning
(sounds like an = instead of a > somewhere).
"""

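# Write a single item whose key+value length is 61231 characters.
mdb = dbm.open("mdb", "n")
print "Writing 1 item to database, but upon reading,"
k = 'k'
v = 'X' * 61230  # Long string
mdb[k] = v
mdb.close()

# Reopen read-only and count the keys.
md = dbm.open("mdb", "r")
print "database contains %i items" % len(md.keys())
md.close()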
msg18707 - Author: John Draper (jdcrunchman) Date: 2003-10-24 22:47
I upped the priority since this bug causes dbm database 
corruption, which could be disastrous for users of dbm or 
shelve.
msg18708 - Author: Tino Lange (tinolange) Date: 2003-10-26 15:06
Hi!

This is not a bug, or at least not a Python bug.
It seems to be a limitation in the underlying dbm engine.

I tested it with different dbm implementations under Linux
today. You are right: when dbm is built against ndbm.h (the one
that belongs to libc6, i.e. db1/ndbm.h), this error/limitation
occurs. Please test: at least for me, the dbm engine still finds
the data if you look the key up directly; a print md[k] shows
the long v string. Only the keys() and len() methods, which
iterate over the database, fail.
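
The check described above (direct lookup versus iteration) can be sketched in present-day Python 3 as follows. This is only an illustration under stated assumptions: the helper name check_roundtrip and its opener parameter are made up for the example, and it defaults to dbm.dumb so that it runs anywhere. With that backend both checks are expected to succeed; to probe the backend discussed here you would have to pass a different opener such as dbm.ndbm.open, where available.

```python
import dbm.dumb
import os
import tempfile


def check_roundtrip(path, key, value, opener=dbm.dumb.open):
    """Store one item, reopen the database, and report how it can be found.

    Returns (found_by_lookup, found_by_keys). The helper name and the
    `opener` parameter are hypothetical, chosen for this example only.
    """
    db = opener(path, "n")  # "n": always create a new, empty database
    try:
        db[key] = value
    finally:
        db.close()

    db = opener(path, "r")  # reopen read-only
    try:
        try:
            found_by_lookup = db[key] == value  # direct lookup by key
        except KeyError:
            found_by_lookup = False
        found_by_keys = key in db.keys()  # iteration over all keys
    finally:
        db.close()
    return found_by_lookup, found_by_keys


with tempfile.TemporaryDirectory() as tmp:
    # Same sizes as in the report: a 1-character key and a 61230-character value.
    result = check_roundtrip(os.path.join(tmp, "mdb"), b"k", b"X" * 61230)
print(result)
```

A result of (True, False) would correspond to the behaviour described in this message: the value is still retrievable by key, but iteration does not see it.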

I also tested without Python, i.e. just using the C API, and
it's just the same (linking against libdb1.so). It is the
underlying dbm_firstkey() which fails when using such HUGE
keys. It just returns NULL.

Context:
--------

drec=dbm_firstkey(dbm);
while(drec.dptr) {
  size++;
  drec = dbm_nextkey(dbm);
}


If you happen to have a system that uses gdbm/ndbm.h as
dbm.h (for example SuSE), or if you change the links in your
include and lib directories accordingly and recompile Python,
then this bug disappears and your program works as
expected. The GNU gdbm engine (I tested 1.73 and 1.8.3)
seems to be much better at handling huge keys than the old
libc dbm.

If possible, change to gdbm or ensure that dbm is linked
against gdbm (-compat). setup.py tries to be smart about
that but sometimes chooses the old dbm even if it could use
gdbm instead.

ldd /usr/local/lib/python2.3/lib-dynload/dbm.so

shows you what lib you're using.
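
On current Python 3, where this module lives at dbm.ndbm, a similar check can be made from inside the interpreter. The sketch below is an illustration under that assumption and does not apply to the Python 2.3 setup discussed here. The helper name ndbm_backend is made up for the example, and dbm.ndbm may be missing entirely on some builds, in which case the helper returns None.

```python
import importlib


def ndbm_backend():
    """Return the name of the library behind dbm.ndbm, or None if unavailable.

    Hypothetical helper for illustration; it relies on the `library`
    attribute that Python 3's dbm.ndbm module exposes.
    """
    try:
        ndbm = importlib.import_module("dbm.ndbm")
    except ImportError:
        # This Python build was compiled without an ndbm-compatible library.
        return None
    return getattr(ndbm, "library", None)


backend = ndbm_backend()
print("dbm.ndbm backend:", backend)
```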

On my Debian system, running
ln -sf /usr/include/gdbm/ndbm.h /usr/include/ndbm.h
ln -sf /usr/lib/libgdbm.so /usr/lib/libndbm.so
fixed it.

Ciao

Tino
msg18709 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2006-07-03 12:48
I'm closing this as a third-party bug. If there is any
evidence that there is something that Python could do to
reduce the impact of this problem, please submit this as a
new bug report.
History
Date                 User         Action  Args
2022-04-11 14:56:00  admin        set     github: 39439
2003-10-21 20:05:57  jdcrunchman  create