Add io.Reader variants to lockfile package #176

Closed
picatz opened this issue Jan 30, 2023 · 18 comments · Fixed by #451
Labels
enhancement New feature or request

Comments

@picatz

picatz commented Jan 30, 2023

At this time, the lockfile package's exported functions only accept a path name, so they expect a file to be available on disk. For my use case, this isn't always true or ideal. Preferably, there would be variants that accept an io.Reader alongside the ones that take a path name.

@cmaritan's work in #164 with ParseApkInstalledFromReader serves as a good foundation to be expanded upon:

func ParseApkInstalled(pathToLockfile string) ([]PackageDetails, error) {

func ParseApkInstalledFromReader(file io.ReadCloser, pathToLockfile string) ([]PackageDetails, error) {

@oliverchang oliverchang added the enhancement New feature or request label Jan 31, 2023
@oliverchang
Collaborator

Hey @picatz, thanks for the suggestion!

Indeed this is something we'd like to have. We've also been discussing internally with some other potential users of this library around some other refactorings to make this package more usable generally (that includes this).

We'll hopefully have something ready to share for input soon :)

@G-Rath
Collaborator

G-Rath commented Feb 3, 2023

fwiw I've made a start on this - patch is attached; main thing I'm still doing is figuring out how to make the JSON/YAML/TOML/XML parsers play nice with an io.Reader, and then the test suites for all parsers should probably be turned into a couple of table-based tests so that they can easily run against both the file and io.Reader versions of each function.

Am pausing for now in favor of #81, but will hopefully be able to have it done in a couple of weeks.

patch
Index: pkg/lockfile/parse-yarn-lock.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pkg/lockfile/parse-yarn-lock.go b/pkg/lockfile/parse-yarn-lock.go
--- a/pkg/lockfile/parse-yarn-lock.go	(revision 5df9444f04bc86e3c64e68ef65bc8e2711edb32c)
+++ b/pkg/lockfile/parse-yarn-lock.go	(date 1675375026394)
@@ -3,6 +3,7 @@
 import (
 	"bufio"
 	"fmt"
+	"io"
 	"net/url"
 	"os"
 	"regexp"
@@ -170,19 +171,33 @@
 	}
 }
 
-func ParseYarnLock(pathToLockfile string) ([]PackageDetails, error) {
+func parseFileWithReader(pathToLockfile string, parserWithReader PackageDetailsParserWithReader) ([]PackageDetails, error) {
 	file, err := os.Open(pathToLockfile)
 	if err != nil {
 		return []PackageDetails{}, fmt.Errorf("could not open %s: %w", pathToLockfile, err)
 	}
 	defer file.Close()
 
-	scanner := bufio.NewScanner(file)
+	details, err := parserWithReader(file)
+
+	if err != nil {
+		err = fmt.Errorf("error while parsing %s: %w", pathToLockfile, err)
+	}
+
+	return details, err
+}
+
+func ParseYarnLock(pathToLockfile string) ([]PackageDetails, error) {
+	return parseFileWithReader(pathToLockfile, ParseYarnLockWithReader)
+}
+
+func ParseYarnLockWithReader(r io.Reader) ([]PackageDetails, error) {
+	scanner := bufio.NewScanner(r)
 
 	packageGroups := groupYarnPackageLines(scanner)
 
 	if err := scanner.Err(); err != nil {
-		return []PackageDetails{}, fmt.Errorf("error while scanning %s: %w", pathToLockfile, err)
+		return []PackageDetails{}, err
 	}
 
 	packages := make([]PackageDetails, 0, len(packageGroups))
Index: pkg/lockfile/parse-requirements-txt.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pkg/lockfile/parse-requirements-txt.go b/pkg/lockfile/parse-requirements-txt.go
--- a/pkg/lockfile/parse-requirements-txt.go	(revision 5df9444f04bc86e3c64e68ef65bc8e2711edb32c)
+++ b/pkg/lockfile/parse-requirements-txt.go	(date 1675375316408)
@@ -2,8 +2,7 @@
 
 import (
 	"bufio"
-	"fmt"
-	"os"
+	"io"
 	"regexp"
 	"strings"
 )
@@ -92,15 +91,13 @@
 }
 
 func ParseRequirementsTxt(pathToLockfile string) ([]PackageDetails, error) {
+	return parseFileWithReader(pathToLockfile, ParseRequirementsTxtWithReader)
+}
+
+func ParseRequirementsTxtWithReader(r io.Reader) ([]PackageDetails, error) {
 	var packages []PackageDetails
 
-	file, err := os.Open(pathToLockfile)
-	if err != nil {
-		return packages, fmt.Errorf("could not open %s: %w", pathToLockfile, err)
-	}
-	defer file.Close()
-
-	scanner := bufio.NewScanner(file)
+	scanner := bufio.NewScanner(r)
 
 	for scanner.Scan() {
 		line := removeComments(scanner.Text())
@@ -113,7 +110,7 @@
 	}
 
 	if err := scanner.Err(); err != nil {
-		return packages, fmt.Errorf("error while scanning %s: %w", pathToLockfile, err)
+		return []PackageDetails{}, err
 	}
 
 	return packages, nil
Index: pkg/lockfile/parse-poetry-lock.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pkg/lockfile/parse-poetry-lock.go b/pkg/lockfile/parse-poetry-lock.go
--- a/pkg/lockfile/parse-poetry-lock.go	(revision 5df9444f04bc86e3c64e68ef65bc8e2711edb32c)
+++ b/pkg/lockfile/parse-poetry-lock.go	(date 1675388656506)
@@ -3,7 +3,7 @@
 import (
 	"fmt"
 	"github.com/BurntSushi/toml"
-	"os"
+	"io"
 )
 
 type PoetryLockPackageSource struct {
@@ -25,18 +25,16 @@
 const PoetryEcosystem = PipEcosystem
 
 func ParsePoetryLock(pathToLockfile string) ([]PackageDetails, error) {
+	return parseFileWithReader(pathToLockfile, ParsePoetryLockWithReader)
+}
+
+func ParsePoetryLockWithReader(r io.Reader) ([]PackageDetails, error) {
 	var parsedLockfile *PoetryLockFile
 
-	lockfileContents, err := os.ReadFile(pathToLockfile)
+	_, err := toml.NewDecoder(r).Decode(&parsedLockfile)
 
 	if err != nil {
-		return []PackageDetails{}, fmt.Errorf("could not read %s: %w", pathToLockfile, err)
-	}
-
-	err = toml.Unmarshal(lockfileContents, &parsedLockfile)
-
-	if err != nil {
-		return []PackageDetails{}, fmt.Errorf("could not parse %s: %w", pathToLockfile, err)
+		return []PackageDetails{}, fmt.Errorf("could not parse: %w", err)
 	}
 
 	packages := make([]PackageDetails, 0, len(parsedLockfile.Packages))
Index: pkg/lockfile/parse.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pkg/lockfile/parse.go b/pkg/lockfile/parse.go
--- a/pkg/lockfile/parse.go	(revision 5df9444f04bc86e3c64e68ef65bc8e2711edb32c)
+++ b/pkg/lockfile/parse.go	(date 1675384123861)
@@ -1,8 +1,10 @@
 package lockfile
 
 import (
+	"bytes"
 	"errors"
 	"fmt"
+	"io"
 	"path/filepath"
 	"sort"
 	"strings"
@@ -153,3 +155,10 @@
 		Packages: packages,
 	}, err
 }
+
+func readBytes(r io.Reader) []byte {
+	buf := new(bytes.Buffer)
+	buf.ReadFrom(r)
+
+	return buf.Bytes()
+}
Index: pkg/lockfile/parse-gradle-lock.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pkg/lockfile/parse-gradle-lock.go b/pkg/lockfile/parse-gradle-lock.go
--- a/pkg/lockfile/parse-gradle-lock.go	(revision 5df9444f04bc86e3c64e68ef65bc8e2711edb32c)
+++ b/pkg/lockfile/parse-gradle-lock.go	(date 1675388734510)
@@ -3,7 +3,7 @@
 import (
 	"bufio"
 	"fmt"
-	"os"
+	"io"
 	"strings"
 )
 
@@ -37,14 +37,12 @@
 }
 
 func ParseGradleLock(pathToLockfile string) ([]PackageDetails, error) {
-	file, err := os.Open(pathToLockfile)
-	if err != nil {
-		return []PackageDetails{}, fmt.Errorf("could not open %s: %w", pathToLockfile, err)
-	}
-	defer file.Close()
+	return parseFileWithReader(pathToLockfile, ParseGradleLockWithReader)
+}
 
+func ParseGradleLockWithReader(r io.Reader) ([]PackageDetails, error) {
 	pkgs := make([]PackageDetails, 0)
-	scanner := bufio.NewScanner(file)
+	scanner := bufio.NewScanner(r)
 
 	for scanner.Scan() {
 		lockLine := strings.TrimSpace(scanner.Text())
@@ -54,7 +52,7 @@
 
 		pkg, err := parseToGradlePackageDetail(lockLine)
 		if err != nil {
-			fmt.Fprintf(os.Stderr, "failed to parse lockline: %s\n", err.Error())
+			// fmt.Fprintf(os.Stderr, "failed to parse lockline: %s\n", err.Error())
 			continue
 		}
 
@@ -62,7 +60,7 @@
 	}
 
 	if err := scanner.Err(); err != nil {
-		return []PackageDetails{}, fmt.Errorf("failed to read %s: %w", pathToLockfile, err)
+		return []PackageDetails{}, err
 	}
 
 	return pkgs, nil
Index: pkg/lockfile/parse-pubspec-lock_test.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pkg/lockfile/parse-pubspec-lock_test.go b/pkg/lockfile/parse-pubspec-lock_test.go
--- a/pkg/lockfile/parse-pubspec-lock_test.go	(revision 5df9444f04bc86e3c64e68ef65bc8e2711edb32c)
+++ b/pkg/lockfile/parse-pubspec-lock_test.go	(date 1675387987210)
@@ -10,7 +10,7 @@
 
 	packages, err := lockfile.ParsePubspecLock("fixtures/pub/does-not-exist")
 
-	expectErrContaining(t, err, "could not read")
+	expectErrContaining(t, err, "could not open")
 	expectPackages(t, packages, []lockfile.PackageDetails{})
 }
 
@@ -219,3 +219,36 @@
 		},
 	})
 }
+
+// func TestParsePubspecLockWithReader_MixedPackages(t *testing.T) {
+// 	t.Parallel()
+//
+// 	packages, err := lockfile.ParsePubspecLockWithReader(openFileWithReader(t, "fixtures/pub/mixed-packages.lock"))
+//
+// 	if err != nil {
+// 		t.Errorf("Got unexpected error: %v", err)
+// 	}
+//
+// 	expectPackages(t, packages, []lockfile.PackageDetails{
+// 		{
+// 			Name:      "back_button_interceptor",
+// 			Version:   "6.0.1",
+// 			Ecosystem: lockfile.PubEcosystem,
+// 		},
+// 		{
+// 			Name:      "build_runner",
+// 			Version:   "2.2.1",
+// 			Ecosystem: lockfile.PubEcosystem,
+// 		},
+// 		{
+// 			Name:      "shelf",
+// 			Version:   "1.3.2",
+// 			Ecosystem: lockfile.PubEcosystem,
+// 		},
+// 		{
+// 			Name:      "shelf_web_socket",
+// 			Version:   "1.0.2",
+// 			Ecosystem: lockfile.PubEcosystem,
+// 		},
+// 	})
+// }
Index: pkg/lockfile/parse-pubspec-lock.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pkg/lockfile/parse-pubspec-lock.go b/pkg/lockfile/parse-pubspec-lock.go
--- a/pkg/lockfile/parse-pubspec-lock.go	(revision 5df9444f04bc86e3c64e68ef65bc8e2711edb32c)
+++ b/pkg/lockfile/parse-pubspec-lock.go	(date 1675387907593)
@@ -3,7 +3,7 @@
 import (
 	"fmt"
 	"gopkg.in/yaml.v2"
-	"os"
+	"io"
 )
 
 type PubspecLockDescription struct {
@@ -61,18 +61,32 @@
 const PubEcosystem Ecosystem = "Pub"
 
 func ParsePubspecLock(pathToLockfile string) ([]PackageDetails, error) {
+	return parseFileWithReader(pathToLockfile, ParsePubspecLockWithReader)
+}
+
+
+// func ParsePubspecLockWithReader(r io.Reader) ([]PackageDetails, error) {
+// 	return parsePubspecLockContents(readBytes(r))
+// }
+
+// func ParsePubspecLock(pathToLockfile string) ([]PackageDetails, error) {
+// 	lockfileContents, err := os.ReadFile(pathToLockfile)
+//
+// 	if err != nil {
+// 		return []PackageDetails{}, fmt.Errorf("could not read %s: %w", pathToLockfile, err)
+// 	}
+//
+// 	return parsePubspecLockContents(lockfileContents)
+// }
+
+// func parsePubspecLockContents(lockfileContents []byte) ([]PackageDetails, error) {
+func ParsePubspecLockWithReader(r io.Reader) ([]PackageDetails, error) {
 	var parsedLockfile *PubspecLockfile
 
-	lockfileContents, err := os.ReadFile(pathToLockfile)
+	err := yaml.NewDecoder(r).Decode(&parsedLockfile)
 
 	if err != nil {
-		return []PackageDetails{}, fmt.Errorf("could not read %s: %w", pathToLockfile, err)
-	}
-
-	err = yaml.Unmarshal(lockfileContents, &parsedLockfile)
-
-	if err != nil {
-		return []PackageDetails{}, fmt.Errorf("could not parse %s: %w", pathToLockfile, err)
+		return []PackageDetails{}, fmt.Errorf("could not parse: %w", err)
 	}
 	if parsedLockfile == nil {
 		return []PackageDetails{}, nil
Index: pkg/lockfile/apk-installed.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pkg/lockfile/apk-installed.go b/pkg/lockfile/apk-installed.go
--- a/pkg/lockfile/apk-installed.go	(revision 5df9444f04bc86e3c64e68ef65bc8e2711edb32c)
+++ b/pkg/lockfile/apk-installed.go	(date 1675383825828)
@@ -3,6 +3,7 @@
 import (
 	"bufio"
 	"fmt"
+	"io"
 	"os"
 	"sort"
 	"strings"
@@ -35,7 +36,7 @@
 	return groups
 }
 
-func parseApkPackageGroup(group []string, pathToLockfile string) PackageDetails {
+func parseApkPackageGroup(group []string) PackageDetails {
 	var pkg = PackageDetails{
 		Ecosystem: AlpineEcosystem,
 		CompareAs: AlpineEcosystem,
@@ -61,9 +62,8 @@
 
 		_, _ = fmt.Fprintf(
 			os.Stderr,
-			"warning: malformed APK installed file. Found no version number in record. Package %s. File: %s\n",
+			"warning: malformed APK installed file. Found no version number in record. Package %s.\n",
 			pkgPrintName,
-			pathToLockfile,
 		)
 	}
 
@@ -71,26 +71,23 @@
 }
 
 func ParseApkInstalled(pathToLockfile string) ([]PackageDetails, error) {
-	file, err := os.Open(pathToLockfile)
-	if err != nil {
-		return []PackageDetails{}, fmt.Errorf("could not open %s: %w", pathToLockfile, err)
-	}
-	defer file.Close()
+	return parseFileWithReader(pathToLockfile, ParseApkInstalledWithReader)
+}
 
-	scanner := bufio.NewScanner(file)
+func ParseApkInstalledWithReader(r io.Reader) ([]PackageDetails, error) {
+	scanner := bufio.NewScanner(r)
 
 	packageGroups := groupApkPackageLines(scanner)
 
 	packages := make([]PackageDetails, 0, len(packageGroups))
 
 	for _, group := range packageGroups {
-		pkg := parseApkPackageGroup(group, pathToLockfile)
+		pkg := parseApkPackageGroup(group)
 
 		if pkg.Name == "" {
 			_, _ = fmt.Fprintf(
 				os.Stderr,
-				"warning: malformed APK installed file. Found no package name in record. File: %s\n",
-				pathToLockfile,
+				"warning: malformed APK installed file. Found no package name in record.\n",
 			)
 
 			continue
@@ -100,7 +97,7 @@
 	}
 
 	if err := scanner.Err(); err != nil {
-		return packages, fmt.Errorf("error while scanning %s: %w", pathToLockfile, err)
+		return []PackageDetails{}, err
 	}
 
 	return packages, nil
Index: pkg/lockfile/helpers_test.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pkg/lockfile/helpers_test.go b/pkg/lockfile/helpers_test.go
--- a/pkg/lockfile/helpers_test.go	(revision 5df9444f04bc86e3c64e68ef65bc8e2711edb32c)
+++ b/pkg/lockfile/helpers_test.go	(date 1675386229714)
@@ -3,10 +3,24 @@
 import (
 	"fmt"
 	"github.com/google/osv-scanner/pkg/lockfile"
+	"io"
+	"os"
 	"strings"
 	"testing"
 )
 
+func openFileWithReader(t *testing.T, pathToFile string) io.Reader {
+	t.Helper()
+
+	file, err := os.Open(pathToFile)
+	if err != nil {
+		t.Fatalf("could not open %s: %v", pathToFile, err)
+	}
+	t.Cleanup(func() { _ = file.Close() })
+
+	return file
+}
+
 func expectErrContaining(t *testing.T, err error, str string) {
 	t.Helper()
 
Index: pkg/lockfile/parse-mix-lock.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pkg/lockfile/parse-mix-lock.go b/pkg/lockfile/parse-mix-lock.go
--- a/pkg/lockfile/parse-mix-lock.go	(revision 5df9444f04bc86e3c64e68ef65bc8e2711edb32c)
+++ b/pkg/lockfile/parse-mix-lock.go	(date 1675383495999)
@@ -3,6 +3,7 @@
 import (
 	"bufio"
 	"fmt"
+	"io"
 	"os"
 	"regexp"
 	"strings"
@@ -11,15 +12,13 @@
 const MixEcosystem Ecosystem = "Hex"
 
 func ParseMixLock(pathToLockfile string) ([]PackageDetails, error) {
-	file, err := os.Open(pathToLockfile)
-	if err != nil {
-		return []PackageDetails{}, fmt.Errorf("could not open %s: %w", pathToLockfile, err)
-	}
-	defer file.Close()
+	return parseFileWithReader(pathToLockfile, ParseMixLockWithReader)
+}
 
+func ParseMixLockWithReader(r io.Reader) ([]PackageDetails, error) {
 	re := regexp.MustCompile(`^ +"(\w+)": \{.+,$`)
 
-	scanner := bufio.NewScanner(file)
+	scanner := bufio.NewScanner(r)
 
 	var packages []PackageDetails
 
@@ -70,7 +69,7 @@
 	}
 
 	if err := scanner.Err(); err != nil {
-		return []PackageDetails{}, fmt.Errorf("error while scanning %s: %w", pathToLockfile, err)
+		return []PackageDetails{}, err
 	}
 
 	return packages, nil
Index: pkg/lockfile/types.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pkg/lockfile/types.go b/pkg/lockfile/types.go
--- a/pkg/lockfile/types.go	(revision 5df9444f04bc86e3c64e68ef65bc8e2711edb32c)
+++ b/pkg/lockfile/types.go	(date 1675374961495)
@@ -1,5 +1,7 @@
 package lockfile
 
+import "io"
+
 type PackageDetails struct {
 	Name      string    `json:"name"`
 	Version   string    `json:"version"`
@@ -11,3 +13,4 @@
 type Ecosystem string
 
 type PackageDetailsParser = func(pathToLockfile string) ([]PackageDetails, error)
+type PackageDetailsParserWithReader = func(r io.Reader) ([]PackageDetails, error)

@G-Rath
Collaborator

G-Rath commented Feb 5, 2023

Ok so I think I've got a path forward for supporting JSON/YAML/TOML/XML parsers with io.Reader as they all seem to support a Decoder that can be passed an io.Reader, though it looks like they error on empty files which is a little annoying but probably something we'll just have to live with (turns out that's just the YAML decoder, and I found a way to handle that)

However, I think #183 should be discussed/landed first so am waiting on that before continuing.

@G-Rath
Collaborator

G-Rath commented Feb 5, 2023

Have opened #189 😅

@oliverchang
Collaborator

Thanks @G-Rath! Let's use these to start discussions, but I'd maybe recommend not doing too much further work on these until we understand all the requirements :) Hopefully in the next few weeks -- we have some potential internal consumers as well who will have input.

@G-Rath
Collaborator

G-Rath commented Feb 6, 2023

@oliverchang sure, though my PR covers exactly what @picatz has asked for, and that I think is a no brainer.

Combined with supporting "diagnostics" in my other PR (which allows the parsers to return any extra arbitrary data folks might want), I'm not sure what use cases couldn't be met without further internal changes 🤷‍♂️

Still, happy to wait until you've had those internal conversations 🙂

@oliverchang
Collaborator

@oliverchang sure, though my PR covers exactly what @picatz has asked for, and that I think is a no brainer.

Combined with supporting "diagnostics" in my other PR (which allows the parsers to return any extra arbitrary data folks might want), I'm not sure what use cases couldn't be met without further internal changes 🤷‍♂️

Still, happy to wait until you've had those internal conversations 🙂

For example, we may want io.ReadSeekCloser instead of just io.Reader. Changing that later would break the interface.

Thanks again for working on this!

@G-Rath
Collaborator

G-Rath commented Feb 6, 2023

I'm not sure why you'd want to do that though when none of the lockfile parsers need that? So you'd just be restricting their input, and you should still be able to pass in a ReadSeekCloser because it conforms to the Reader interface.

My understanding is that Reader represents the smallest interface required by all the parsers and that anything that implements Reader (including ReadSeekCloser, File, etc) can be passed in, which is why it makes sense to support it like this - everything else (like reading from a file on disk) is building on top of that and so could be supported in userland with minimal code, or via a helper (i.e. ParseLockFile) in the library...

@oliverchang
Collaborator

I'm not sure why you'd want to do that though when none of the lockfile parsers need that? So you'd just be restricting their input, and you should still be able to pass in a ReadSeekCloser because it conforms to the Reader interface.

My understanding is that Reader represents the smallest interface required by all the parsers, which is why it makes sense to support it like this - everything else (like reading from a file on disk) is building on top of that and so could be supported in userland with minimal code, or via a helper (i.e. ParseLockFile) in the library...

Again, part of our internal conversations :) (and this is just one example). Let's circle back once we have more details / clarity.

@oliverchang
Collaborator

Sorry for the slow replies here. io.Reader should suffice here.

That said, we should start defining interfaces for the parsers here to make the namespace a bit cleaner + make it clearer what exactly needs to be implemented for a particular lockfile format. Something like:

type Parser interface {
  Parse(r io.Reader) []PackageDetails
}

We should still keep the individual ParseApkInstalled etc around for compatibility and convenience. They will just be lightweight wrappers on top of the interface methods. If someone wants to use the lower level io.Reader interfaces, they can just use the ones defined as part of the interfaces.

@G-Rath @another-rex WDYT?

@G-Rath
Collaborator

G-Rath commented Mar 20, 2023

Are you meaning like:

type PackageLockParser interface {
  Parse(r io.Reader) []PackageDetails
}

fwiw, #260 has made things a bit trickier - we now also need to expose a way for parsers to read "another file"; we shouldn't make this change without resolving this too, because it's important for supporting container scanning.

@oliverchang
Collaborator

Are you meaning like:

type PackageLockParser interface {
  Parse(r io.Reader) []PackageDetails
}

fwiw, #260 has made things a bit trickier - we now also need to expose a way for parsers to read "another file"; we shouldn't make this change without resolving this too, because it's important for supporting container scanning.

Ah good callout. It seems like to fully support this we will need some kind of lightweight filesystem abstraction as well then. As you mention this does seem to tie in closely to container scanning as well.

@oliverchang
Collaborator

oliverchang commented Jun 28, 2023

Pushing this thread along a bit again. How about something along the lines of:

type FSAccessor interface {
  // path should be relative to the main input path.
  Get(path string) io.ReadCloser
}

type Input struct {
  Reader io.Reader  // the main input
  FS FSAccessor  // optional, but this is required in some cases for certain ecosystems. An error should be returned by the Parser if this is required during parsing and not provided.
}

type Parser interface {
  // Heuristic to check if a given path/filename is supported by the parser.
  ShouldParse(path string) bool
  Parse(input Input) ([]PackageDetails, error)
}

Having the main input be a struct also helps with future extensibility without breaking compatibility.

@G-Rath
Collaborator

G-Rath commented Jun 29, 2023

FSAccessor might be the better way to go, but my latest iteration has been to implement a ParsableFile interface: G-Rath/osv-detector@0c403c2#diff-b4fdf6ad8e8b187ab8a980013908e29bff84155148148811e97f5f41080f790fR52-R58

@picatz
Author

picatz commented Jun 29, 2023

Just another option to consider, but the io/fs package might be useful in this case? io/fs.FS seems very similar to the FSAccessor interface.

@another-rex
Collaborator

I'm thinking a mix between the two options, something like this:

// ParsableFile is an abstraction for a file that has been opened for parsing
// to create a Lockfile, and that knows how to open other ParsableFiles
// relative to itself.
type ParsableFile interface {
  io.ReadCloser

  // if string is a relative path, open relative to the current file
  // otherwise open the file at the absolute path
  Open(string) (ParsableFile, error) 

  Path() string
}


type Parser interface {
  // Heuristic to check if a given path/filename is supported by the parser. 
  ShouldParse(path string) bool
  Parse(input ParsableFile) ([]PackageDetails, error)
}

This allows the implementer to also support specifying files on absolute paths.

@another-rex
Collaborator

The interface we are planning on going with:

var ErrNotSupported = errors.New("this file does not support opening files")

// DependenciesFile is an abstraction for a file that has been opened for extraction, and that knows how to open other DependenciesFile relative to itself.
type DependenciesFile interface {
  io.ReadCloser

  // if string is a relative path, open relative to the current file
  // otherwise open the file at the absolute path
  //
  // If the DependenciesFile does not implement Open, Extractor should 
  // record the failure to open the file, and return the partial list
  // of PackageDetails, in addition to ErrNotSupported error
  Open(string) (DependenciesFile, error)
  
  Path() string
}


type Extractor interface {
  // Heuristic to check if a given path/filename is supported by the parser. 
  ShouldExtract(path string) bool
  Extract(input DependenciesFile) ([]PackageDetails, error)
}

This gives the implementer a lot of flexibility in how files are opened, including opening them through a custom fs.FS implementation.

@another-rex
Collaborator

A final update to the design: the main change is to split DependenciesFile into one type that has to be closed and one that doesn't. This helps prevent hard-to-catch missing-close and double-close errors when using the interface.

var ErrNotSupported = errors.New("this file does not support opening files")

// DepFile is an abstraction for a file that has been opened for extraction, and that knows how to open other DepFiles relative to itself.
type DepFile interface {
  io.Reader

  // if string is a relative path, open relative to the current file
  // otherwise open the file at the absolute path
  //
  // If the DepFile does not implement Open, Extractor should
  // record the failure to open the file, and return the partial list
  // of PackageDetails, in addition to ErrNotSupported error
  Open(string) (NestedDepFile, error)
  
  Path() string
}

// NestedDepFile is an abstraction for a file that has been opened while extracting another file, and would need to be closed.
type NestedDepFile interface {
  io.Closer
  DepFile
}

type Extractor interface {
  // Heuristic to check if a given path/filename is supported by the parser. 
  ShouldExtract(path string) bool
  Extract(input DepFile) ([]PackageDetails, error)
}

Both DepFile and NestedDepFile can be implemented by different structures, or by one struct like LocalFile:

// A LocalFile represents a file that exists on the local filesystem.
type LocalFile struct {
	io.ReadCloser

	path string
}

func (f LocalFile) Open(path string) (NestedDepFile, error) {
	...
}

func (f LocalFile) Path() string { return f.path }

// Compile-time checks that LocalFile satisfies both interfaces.
var _ DepFile = LocalFile{}
var _ NestedDepFile = LocalFile{}
