Maximum execution time exceeded in Parser/Cursor.php when processing large base64 string. #1026

Open
Duberry opened this issue Jun 7, 2024 · 2 comments

Duberry commented Jun 7, 2024

Version(s) affected

2.4.2

Description

When a large base64 image appears in Markdown directly adjacent to other text, Cursor.php exceeds the maximum execution time at line 180:

        if ($this->isMultibyte) {
            return $this->charCache[$this->currentPosition] ??= \mb_substr($this->line, $this->currentPosition, 1, 'UTF-8');
        }

When a carriage return separates the base64 image from the text, the error does not occur.
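One possible reading of this observation, offered as a guess rather than a confirmed diagnosis: the constructor quoted under "Possible solution" flags a line as multibyte whenever its character count differs from its byte count, so a single non-ASCII character in the adjacent text would push the whole line, base64 payload included, onto the `mb_substr` path. The sketch below assumes the adjacent text contains such a character; `isMultibyteLine()`, the sample text, and the payload are hypothetical stand-ins, not library code or real image data.

```php
<?php

// Mirrors the check in the quoted constructor: a line counts as "multibyte"
// when its UTF-8 character count differs from its byte count.
function isMultibyteLine(string $line): bool
{
    return \mb_strlen($line, 'UTF-8') !== \strlen($line);
}

// Stand-in payload (pure ASCII, like any base64 string); not real image data.
$payload = \str_repeat('iVBORw0KGgo', 1000);
$image   = '![image.png](data:image/png;base64,' . $payload . ')';

// Assumed adjacent text containing one non-ASCII character ("é").
$text = "Caf\u{e9} menu.";

// Same line: the single "é" makes the entire line multibyte.
$sameLine = $text . $image;

// Separate lines: only the short text line is multibyte.
$separated = \explode("\n", $text . "\n" . $image);

\var_dump(isMultibyteLine($sameLine));     // text and image together
\var_dump(isMultibyteLine($separated[0])); // text line only
\var_dump(isMultibyteLine($separated[1])); // image line only
```

If this reading is right, separating the image from the text would leave the long line pure ASCII, which the quoted code reads by byte index instead.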

How to reproduce

Place a large base64 image in Markdown directly adjacent to plain text, then try to generate HTML. The conversion times out.

This is my markup.![image.png](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAABrEAAAJUCAYAAACyig2eAAAgAElEQVR4XuydB2AUZdrH/9uSQCD03kFUioIgUoRTBAEVBXvXs9x3nufZ61nA3vvZe0exgIqAHUUEqdJ7E6SEDiF98z3POzOb2c1udhOWGOA/3xeT7M68885v3nAwv/......................

I have not been able to determine the exact length of base64 string required to cause the issue. It's large, but less than 1 MB.

Possible solution

Cache the whole string as an array of characters up front if it's multibyte:

    public function __construct(string $line)
    {
        if (! \mb_check_encoding($line, 'UTF-8')) {
            throw new UnexpectedEncodingException('Unexpected encoding - UTF-8 or ASCII was expected');
        }

        $this->line            = $line;
        $this->length          = \mb_strlen($line, 'UTF-8') ?: 0;
        $this->isMultibyte     = $this->length !== \strlen($line);
        $this->lastTabPosition = $this->isMultibyte ? \mb_strrpos($line, "\t", 0, 'UTF-8') : \strrpos($line, "\t");

        // Cache every character up front if the line is multibyte, so later lookups are plain array reads.
        // preg_split() returns false on failure, so fall back to an empty cache in that case.
        if ($this->isMultibyte) {
            $this->charCache = \preg_split('//u', $line, -1, \PREG_SPLIT_NO_EMPTY) ?: [];
        }
    }

    public function getCurrentCharacter(): ?string
    {
        if ($this->currentPosition >= $this->length) {
            return null;
        }

        // Multibyte lines read from the pre-built character cache.
        if ($this->isMultibyte) {
            return $this->charCache[$this->currentPosition] ?? null;
        }

        // ASCII-only lines can be indexed directly by byte offset.
        return $this->line[$this->currentPosition];
    }

Additional context

No response

Did this project help you today? Did it make you happy in any way?

No response

@colinodell
Member

Thanks for the report!

I think I tried your suggested approach many years ago and found that it was slower? But I could be misremembering that, so I do think it's worthwhile to revisit this.
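One way to revisit the question is a small standalone benchmark, sketched here under stated assumptions. It compares reading every character by position with `mb_substr` (what the quoted snippet does on a cache miss) against splitting the line once with `preg_split`. Since UTF-8 is variable-width, finding the Nth character generally means scanning from the start of the string, so the first approach may scale quadratically with line length, though the exact cost depends on the PHP and mbstring version. The function names and the test line are hypothetical, not library code.

```php
<?php

// Approach 1: fetch each character by its position with mb_substr().
function charsByPosition(string $line): array
{
    $length = \mb_strlen($line, 'UTF-8');
    $chars  = [];
    for ($i = 0; $i < $length; $i++) {
        $chars[] = \mb_substr($line, $i, 1, 'UTF-8');
    }

    return $chars;
}

// Approach 2: split the line into characters once, up front.
function charsBySplit(string $line): array
{
    $chars = \preg_split('//u', $line, -1, \PREG_SPLIT_NO_EMPTY);

    return $chars === false ? [] : $chars;
}

// Returns the elapsed wall-clock time of one call, in milliseconds.
function timeIt(callable $fn, string $line): float
{
    $start = \hrtime(true);
    $fn($line);

    return (\hrtime(true) - $start) / 1e6;
}

// One non-ASCII character followed by ASCII filler, standing in for
// text adjacent to a base64 payload. Increase the repeat count to see
// how each approach scales.
$line = "\u{e9}" . \str_repeat('A', 5000);

\printf("mb_substr per position: %.2f ms\n", timeIt('charsByPosition', $line));
\printf("preg_split once:        %.2f ms\n", timeIt('charsBySplit', $line));
```

Timings are deliberately not quoted here because they vary by environment. The benchmark also ignores memory: the split approach holds one array entry per character, which may matter for lines approaching the size described in this report.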

Maybe we should also add some configuration options to limit/avoid parsing of excessively long lines?
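Until such an option exists, a caller could approximate it by checking the input before parsing. The following is a minimal sketch of that idea, not an existing league/commonmark feature: `findOverlongLines()` and its `$maxLineBytes` parameter are hypothetical names, and the limit is measured in bytes for simplicity.

```php
<?php

/**
 * Hypothetical pre-parse guard: returns the 1-based numbers of all lines
 * whose byte length exceeds $maxLineBytes.
 *
 * @return int[]
 */
function findOverlongLines(string $markdown, int $maxLineBytes): array
{
    // Split on any common line ending (CRLF, LF, or CR).
    $lines = \preg_split('/\r\n|\n|\r/', $markdown);
    if ($lines === false) {
        return [];
    }

    $offending = [];
    foreach ($lines as $index => $line) {
        if (\strlen($line) > $maxLineBytes) {
            $offending[] = $index + 1;
        }
    }

    return $offending;
}

// Usage: reject or pre-process the document when any line is too long.
$markdown = "Intro text\n" . \str_repeat('A', 200) . "\nOutro text";
$tooLong  = findOverlongLines($markdown, 100);

if ($tooLong !== []) {
    echo 'Lines over the limit: ' . \implode(', ', $tooLong) . "\n";
}
```

Whether overlong lines should be rejected, truncated, or passed through unparsed is a design decision this sketch leaves open.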

@colinodell
Member

(oops, I didn't mean to hit the close button 😅)


2 participants