-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support of PyPy interpreter (smmap, gitdb, and GitPython) #49
Comments
I wouldn't worry too much about it and rather deactivate the test under PyPy. By now, I would be surprised if smmap is working well at all, and believe that the recommended codepaths in GitPython won't be using it anyway. Once PyPy works, please feel free to submit a PR and add it to the workflow. |
The failure occurs with SlidingWindowMapManager, not StaticWindowMapManager, the tests pass when SlidingWindowMapManager is not used. diff --git a/smmap/test/test_buf.py b/smmap/test/test_buf.py
index f0a86fb..0a6eea8 100644
--- a/smmap/test/test_buf.py
+++ b/smmap/test/test_buf.py
@@ -82,5 +85,5 @@ class TestBuf(TestBase):
for item in (fc.path, fd):
- for manager, man_id in ((man_optimal, 'optimal'),
- (man_worst_case, 'worst case'),
- (static_man, 'static optimal')):
+ for manager, man_id in (#(man_optimal, 'optimal'),
+ #(man_worst_case, 'worst case'),
+ (static_man, 'static optimal'),):
buf = SlidingWindowMapBuffer(manager.make_cursor(item) gitdb and GitPython fail their tests too, my guess is that they seem to rely on CPython-specific behavior of |
It's also use of del that causes the smmap to fail. diff --git a/smmap/test/test_buf.py b/smmap/test/test_buf.py
index f0a86fb..0ab7310 100644
--- a/smmap/test/test_buf.py
+++ b/smmap/test/test_buf.py
@@ -1,126 +1,126 @@
from .lib import TestBase, FileCreator
from smmap.mman import (
SlidingWindowMapManager,
StaticWindowMapManager
)
from smmap.buf import SlidingWindowMapBuffer
from random import randint
from time import time
import sys
import os
man_optimal = SlidingWindowMapManager()
man_worst_case = SlidingWindowMapManager(
window_size=TestBase.k_window_test_size // 100,
max_memory_size=TestBase.k_window_test_size // 3,
max_open_handles=15)
static_man = StaticWindowMapManager()
class TestBuf(TestBase):
def test_basics(self):
with FileCreator(self.k_window_test_size, "buffer_test") as fc:
# invalid paths fail upon construction
c = man_optimal.make_cursor(fc.path)
self.assertRaises(ValueError, SlidingWindowMapBuffer, type(c)()) # invalid cursor
self.assertRaises(ValueError, SlidingWindowMapBuffer, c, fc.size) # offset too large
buf = SlidingWindowMapBuffer() # can create uninitialized buffers
- assert buf.cursor() is None
- # can call end access any time
- buf.end_access()
- buf.end_access()
- assert len(buf) == 0
+ with (buf, c):
+ assert buf.cursor() is None
- # begin access can revive it, if the offset is suitable
- offset = 100
- assert buf.begin_access(c, fc.size) == False
- assert buf.begin_access(c, offset) == True
- assert len(buf) == fc.size - offset
- assert buf.cursor().is_valid()
+ # can call end access any time
+ buf.end_access()
+ buf.end_access()
+ assert len(buf) == 0
- # empty begin access keeps it valid on the same path, but alters the offset
- assert buf.begin_access() == True
- assert len(buf) == fc.size
- assert buf.cursor().is_valid()
+ # begin access can revive it, if the offset is suitable
+ offset = 100
+ assert buf.begin_access(c, fc.size) == False
+ assert buf.begin_access(c, offset) == True
+ assert len(buf) == fc.size - offset
+ assert buf.cursor().is_valid()
- # simple access
- with open(fc.path, 'rb') as fp:
- data = fp.read()
- assert data[offset] == buf[0]
- assert data[offset:offset * 2] == buf[0:offset]
+ # empty begin access keeps it valid on the same path, but alters the offset
+ assert buf.begin_access() == True
+ assert len(buf) == fc.size
+ assert buf.cursor().is_valid()
- # negative indices, partial slices
- assert buf[-1] == buf[len(buf) - 1]
- assert buf[-10:] == buf[len(buf) - 10:len(buf)]
+ # simple access
+ with open(fc.path, 'rb') as fp:
+ data = fp.read()
+ assert data[offset] == buf[0]
+ assert data[offset:offset * 2] == buf[0:offset]
- # end access makes its cursor invalid
- buf.end_access()
- assert not buf.cursor().is_valid()
- assert buf.cursor().is_associated() # but it remains associated
+ # negative indices, partial slices
+ assert buf[-1] == buf[len(buf) - 1]
+ assert buf[-10:] == buf[len(buf) - 10:len(buf)]
- # an empty begin access fixes it up again
- assert buf.begin_access() == True and buf.cursor().is_valid()
- del(buf) # ends access automatically
- del(c)
+ # end access makes its cursor invalid
+ buf.end_access()
+ assert not buf.cursor().is_valid()
+ assert buf.cursor().is_associated() # but it remains associated
+
+ # an empty begin access fixes it up again
+ assert buf.begin_access() == True and buf.cursor().is_valid()
assert man_optimal.num_file_handles() == 1
# PERFORMANCE
# blast away with random access and a full mapping - we don't want to
# exaggerate the manager's overhead, but measure the buffer overhead
# We do it once with an optimal setting, and with a worse manager which
# will produce small mappings only !
max_num_accesses = 100
fd = os.open(fc.path, os.O_RDONLY)
for item in (fc.path, fd):
for manager, man_id in ((man_optimal, 'optimal'),
(man_worst_case, 'worst case'),
- (static_man, 'static optimal')):
- buf = SlidingWindowMapBuffer(manager.make_cursor(item))
- assert manager.num_file_handles() == 1
- for access_mode in range(2): # single, multi
- num_accesses_left = max_num_accesses
- num_bytes = 0
- fsize = fc.size
+ (static_man, 'static optimal'),):
+ with SlidingWindowMapBuffer(manager.make_cursor(item)) as buf:
+ assert manager.num_file_handles() == 1
+ for access_mode in range(2): # single, multi
+ num_accesses_left = max_num_accesses
+ num_bytes = 0
+ fsize = fc.size
- st = time()
- buf.begin_access()
- while num_accesses_left:
- num_accesses_left -= 1
- if access_mode: # multi
- ofs_start = randint(0, fsize)
- ofs_end = randint(ofs_start, fsize)
- d = buf[ofs_start:ofs_end]
- assert len(d) == ofs_end - ofs_start
- assert d == data[ofs_start:ofs_end]
- num_bytes += len(d)
- del d
- else:
- pos = randint(0, fsize)
- assert buf[pos] == data[pos]
- num_bytes += 1
- # END handle mode
- # END handle num accesses
+ st = time()
+ buf.begin_access()
+ while num_accesses_left:
+ num_accesses_left -= 1
+ if access_mode: # multi
+ ofs_start = randint(0, fsize)
+ ofs_end = randint(ofs_start, fsize)
+ d = buf[ofs_start:ofs_end]
+ assert len(d) == ofs_end - ofs_start
+ assert d == data[ofs_start:ofs_end]
+ num_bytes += len(d)
+ del d
+ else:
+ pos = randint(0, fsize)
+ assert buf[pos] == data[pos]
+ num_bytes += 1
+ # END handle mode
+ # END handle num accesses
- buf.end_access()
- assert manager.num_file_handles()
- assert manager.collect()
- assert manager.num_file_handles() == 0
- elapsed = max(time() - st, 0.001) # prevent zero division errors on windows
- mb = float(1000 * 1000)
- mode_str = (access_mode and "slice") or "single byte"
- print("%s: Made %i random %s accesses to buffer created from %s reading a total of %f mb in %f s (%f mb/s)"
- % (man_id, max_num_accesses, mode_str, type(item), num_bytes / mb, elapsed, (num_bytes / mb) / elapsed),
- file=sys.stderr)
- # END handle access mode
- del buf
+ buf.end_access()
+ assert manager.num_file_handles()
+ assert manager.collect()
+ assert manager.num_file_handles() == 0
+ assert res
+ elapsed = max(time() - st, 0.001) # prevent zero division errors on windows
+ mb = float(1000 * 1000)
+ mode_str = (access_mode and "slice") or "single byte"
+ print("%s: Made %i random %s accesses to buffer created from %s reading a total of %f mb in %f s (%f mb/s)"
+ % (man_id, max_num_accesses, mode_str, type(item), num_bytes / mb, elapsed, (num_bytes / mb) / elapsed),
+ file=sys.stderr)
+ # END handle access mode
# END for each manager
# END for each input
os.close(fd) |
Thanks for figuring this out! Yes, there were times when Python was reference counted, which allowed to trigger destructors/finalizers somewhat consistently, but these times are long gone. I don't know to what extend it's worth putting time into this - it's clearly not working anymore (at least not as intended) and nobody seems to have taken issue with this. Maybe it's used nowhere anymore. The reason it can't just be removed is the reluctance to break backwards compatibility. If you think it's worth making some adjustments to make the tests pass and take it where it needs to be, please feel free to submit PRs for that. |
The tests pass with the change above, but it would only fix testing of smmap ; gitdb and GitPython are also broken on PyPy, for the same reasons (closing stuff in destructors, perhaps also how they use smmap, I haven't exhaustively checked). I can't promise I'll find the time to fix all of this by myself... |
Understandable! Since What about detecting, in GitPython, that the runtime environment is PyPy, and based on that certain features won't be imported or made available at all. That way, one can focus on what matters, and probably get to run all tests successfully as well. |
I was hoping to get this to run under PyPy but this is happening.
I may take a look and suggest a fix, but maybe a someone who knows the codebase can figure this problem out easily?
Edit: the code makes a heavy use of
del
expecting instant behavior, but this is not automatic under pypy. See http://docs.cython.org/en/latest/src/userguide/pypy.htmlThe text was updated successfully, but these errors were encountered: