Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update AzureSearchAdminKey detector #3634

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
106 changes: 61 additions & 45 deletions pkg/detectors/azuresearchadminkey/azuresearchadminkey.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,8 @@ package azuresearchadminkey
import (
"context"
"fmt"
"io"
rgmz marked this conversation as resolved.
Show resolved Hide resolved
"net/http"
"strings"

regexp "github.com/wasilibs/go-re2"

Expand All @@ -23,79 +23,95 @@ var _ detectors.Detector = (*Scanner)(nil)

var (
defaultClient = common.SaneHttpClient()
// Make sure that your group is surrounded in boundary characters such as below to reduce false positives.
keyPat = regexp.MustCompile(detectors.PrefixRegex([]string{"azure"}) + `\b([0-9a-zA-Z]{52})\b`)
servicePat = regexp.MustCompile(detectors.PrefixRegex([]string{"azure"}) + `\b([0-9a-zA-Z]{7,40})\b`)

servicePat = regexp.MustCompile(`\b([a-z0-9][a-z0-9-]{5,58}[a-z0-9])\.search\.windows\.net`)
keyPat = regexp.MustCompile(detectors.PrefixRegex([]string{"azure", "windows.net"}) + `\b([a-zA-Z0-9]{52})\b`)
)

// Keywords are used for efficiently pre-filtering chunks.
// Use identifiers in the secret preferably, or the provider name.
func (s Scanner) Keywords() []string {
return []string{"azure"}
return []string{"search.windows.net"}
}

func (s Scanner) Type() detectorspb.DetectorType {
return detectorspb.DetectorType_AzureSearchAdminKey
}

func (s Scanner) Description() string {
return "Azure Search is a search-as-a-service solution that allows developers to incorporate search capabilities into their applications. Azure Search Admin Keys can be used to manage and query search services."
}

// FromData will find and optionally verify AzureSearchAdminKey secrets in a given set of bytes.
func (s Scanner) FromData(ctx context.Context, verify bool, data []byte) (results []detectors.Result, err error) {
dataStr := string(data)

matches := keyPat.FindAllStringSubmatch(dataStr, -1)
serviceMatches := servicePat.FindAllStringSubmatch(dataStr, -1)

for _, match := range matches {
if len(match) != 2 {
// Deduplicate results.
serviceMatches := make(map[string]struct{})
for _, matches := range servicePat.FindAllStringSubmatch(dataStr, -1) {
serviceMatches[matches[1]] = struct{}{}
}
keyMatches := make(map[string]struct{})
for _, matches := range keyPat.FindAllStringSubmatch(dataStr, -1) {
k := matches[1]
if detectors.StringShannonEntropy(k) < 4 {
continue
}
resMatch := strings.TrimSpace(match[1])
for _, serviceMatch := range serviceMatches {
if len(serviceMatch) != 2 {
continue
}
resServiceMatch := strings.TrimSpace(serviceMatch[1])
keyMatches[k] = struct{}{}
}

s1 := detectors.Result{
for key := range keyMatches {
for service := range serviceMatches {
r := detectors.Result{
DetectorType: detectorspb.DetectorType_AzureSearchAdminKey,
Raw: []byte(resMatch),
RawV2: []byte(resMatch + resServiceMatch),
Raw: []byte(key),
RawV2: []byte(`{"service":"` + service + `","key":"` + key + `"}`),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the RawV2 value will break some enterprise customers, as we currently have secrets verified using the old key+service format. This change would orphan previously verified credentials and create duplicates. Is there a reason we can't continue supporting the old format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we can't continue supporting the old format?

Obviously you get the final say here, but #2128 has been a PITA for me and others using OSS.

TL;DR Most of the existing RawV2 values are suboptional because they smash together text (sometimes without a delimiter) and are devoid of context/keywords. This makes it impossible to programmatically re-construct and re-verify results.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that when RawV2 was proposed, we didn't account for its visibility in OSS reporting. Wonder if we can find a solution that preserves readability for OSS while avoiding duplicates on the enterprise side?

This likely requires a broader discussion, and I think @zricethezav will have the most input, but I agree that combining multiple credentials makes them difficult to interpret and largely ineffective

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that when RawV2 was proposed, we didn't account for its visibility in OSS reporting. Wonder if we can find a solution that preserves readability for OSS while avoiding duplicates on the enterprise side?

I can't speak to the impact on enterprise. I can revert the change to RawV2, though I'd prefer to improve it (eventually).

}

if verify {
client := s.client
if client == nil {
client = defaultClient
}
req, err := http.NewRequestWithContext(ctx, "GET", "https://"+resServiceMatch+".search.windows.net/servicestats?api-version=2023-10-01-Preview", nil)
if err != nil {
continue
}
req.Header.Add("api-key", resMatch)

res, err := client.Do(req)
if err == nil {
defer res.Body.Close()
if res.StatusCode >= 200 && res.StatusCode < 300 {
s1.Verified = true
} else if res.StatusCode == 401 || res.StatusCode == 403 {
// The secret is determinately not verified (nothing to do)
} else {
err = fmt.Errorf("unexpected HTTP response status %d", res.StatusCode)
s1.SetVerificationError(err, resMatch)
}
} else {
s1.SetVerificationError(err, resMatch)
}

isVerified, verificationErr := verifyMatch(ctx, client, service, key)
r.Verified = isVerified
r.SetVerificationError(verificationErr, key)
}

results = append(results, s1)
results = append(results, r)
if r.Verified {
break
}
}
}

return results, nil
}

func (s Scanner) Type() detectorspb.DetectorType {
return detectorspb.DetectorType_AzureSearchAdminKey
}
func verifyMatch(ctx context.Context, client *http.Client, service string, key string) (bool, error) {
req, err := http.NewRequestWithContext(ctx, "GET", "https://"+service+".search.windows.net/servicestats?api-version=2023-10-01-Preview", nil)
if err != nil {
return false, err
}
req.Header.Set("api-key", key)

func (s Scanner) Description() string {
return "Azure Search is a search-as-a-service solution that allows developers to incorporate search capabilities into their applications. Azure Search Admin Keys can be used to manage and query search services."
res, err := client.Do(req)
if err != nil {
return false, err
}
defer func() {
_, _ = io.Copy(io.Discard, res.Body)
_ = res.Body.Close()
}()

switch res.StatusCode {
case http.StatusOK:
return true, nil
case http.StatusUnauthorized, http.StatusForbidden:
// The secret is determinately not verified.
return false, nil
default:
return false, fmt.Errorf("unexpected HTTP response status %d", res.StatusCode)
}
}
55 changes: 36 additions & 19 deletions pkg/detectors/azuresearchadminkey/azuresearchadminkey_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,19 +10,6 @@ import (
"github.com/trufflesecurity/trufflehog/v3/pkg/engine/ahocorasick"
)

var (
validPattern = `
azure:
azureKey: wRRPyhjv8m6JGRujUUrPKa8d3rJ0mrGAxhmqf3A68OgZmlWUJyma
azureService: TestingService01
`
invalidPattern = `
azure:
Key: wRRPyhjv8m6JGRujUUr-PK#a8d3rJ0mrGAxhmqf3A68OgZmlWUJyma
Service: TS01
`
)

func TestAzureSearchAdminKey_Pattern(t *testing.T) {
d := Scanner{}
ahoCorasickCore := ahocorasick.NewAhoCorasickCore([]detectors.Detector{d})
Expand All @@ -33,14 +20,44 @@ func TestAzureSearchAdminKey_Pattern(t *testing.T) {
want []string
}{
{
name: "valid pattern",
input: validPattern,
want: []string{"wRRPyhjv8m6JGRujUUrPKa8d3rJ0mrGAxhmqf3A68OgZmlWUJymaTestingService01", "wRRPyhjv8m6JGRujUUrPKa8d3rJ0mrGAxhmqf3A68OgZmlWUJymaazureKey"},
name: "valid pattern",
input: `
azure:
azureKey: wRRPyhjv8m6JGRujUUrPKa8d3rJ0mrGAxhmqf3A68OgZmlWUJyma
azureService: testingservice01.search.windows.net
`,
want: []string{`{"service":"testingservice01","key":"wRRPyhjv8m6JGRujUUrPKa8d3rJ0mrGAxhmqf3A68OgZmlWUJyma"}`},
},
{
name: "jupyter notebook",
input: `
{
"cell_type": "code",
"execution_count": 7,
"id": "b188568f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"https://news-search.search.windows.net azureblob-index 2021-04-30-Preview iOExxhJ2A2wjdAGTjsMxASsJj3y3zUR54kjNTNpW9hAzSeD8PE3k\n"
]
}
],
"source": [
"print(os.getenv(\"AZURE_COGNITIVE_SEARCH_ENDPOINT\"), os.getenv('AZURE_COGNITIVE_SEARCH_INDEX_NAME'), os.getenv('AZURE_COGNITIVE_SEARCH_API_VERSION'), os.getenv(\"AZURE_COGNITIVE_SEARCH_KEY\"))"
]`,
want: []string{`{"service":"news-search","key":"iOExxhJ2A2wjdAGTjsMxASsJj3y3zUR54kjNTNpW9hAzSeD8PE3k"}`},
},
{
name: "invalid pattern",
input: invalidPattern,
want: nil,
name: "invalid pattern",
input: `
azure:
Key: wRRPyhjv8m6JGRujUUr-PK#a8d3rJ0mrGAxhmqf3A68OgZmlWUJyma
Service: TS01.search.windows.net
`,
want: nil,
},
}

Expand Down