Files
opencommit-gitea/src/generateCommitMessageFromGitDiff.ts
T
Jakub Rosa f814c6b89d add support for Azure OpenAI API - continue MR 167 (#324)
*  feat(api.ts): add support for Azure OpenAI API

The Azure OpenAI API is now supported in addition to the OpenAI API. The API type can be specified in the configuration file using the OPENAI_API_TYPE key. If the key is not specified, the default value is 'openai'. The AzureOpenAIApi class is added to the utils folder to handle the Azure OpenAI API calls. The createChatCompletion method is implemented in the AzureOpenAIApi class to handle the chat completion requests. The method is called in the generateCommitMessage method in the OpenAi class if the apiType is set to 'azure'.

* 🐛 fix(AzureOpenAI.ts): fix import path for AxiosRequestConfig to avoid conflicts with openai's axios dependency

In AzureOpenAI.ts, the import path for AxiosRequestConfig was changed to avoid conflicts with openai's axios dependency, which was causing lint errors.

* 🔧 fix(AzureOpenAI.ts): import RequiredError to fix error handling and remove commented out debug code

The RequiredError class was not being imported from the openai/dist/base module, causing errors to be thrown incorrectly. This has been fixed by importing the RequiredError class. Debug code has been removed and comments have been updated to reflect the changes made.

* 🔇 chore(AzureOpenAI.ts): remove console.log statement and translate Japanese comment

The commented console.log statement was removed to improve code cleanliness.

* 🔥 refactoring(api.ts, AzureOpenAI.ts): Leverage openai npm package
🐛 fix(config.ts): API Key string validation

*  (README.md): update opencommit command to set OCO_AI_PROVIDER instead of OPENAI_API_TYPE to improve consistency and clarity in configuration
♻️ (config.ts): update OCO_AI_PROVIDER enum in configValidators to include 'azure' and remove unnecessary conditionals to improve maintainability and extensibility
⬆️ (config.ts): add OCO_AZURE_API_VERSION to ConfigType and getConfig() to support new azure api version configuration
♻️ (engine/ollama.ts): add space between temperature and top_p properties to improve readability
♻️ (engine/openAi.ts): refactor OpenAi class to improve readability and maintainability by extracting configuration logic into separate switch statement
🔧 (generateCommitMessageFromGitDiff.ts): refactor MAX_TOKENS_INPUT and MAX_TOKENS_OUTPUT lines to improve readability
🔧 (generateCommitMessageFromGitDiff.ts): refactor generateCommitMessageByDiff and getMessagesPromisesByChangesInFile functions to use destructuring and improve readability
♻️ (generateCommitMessageFromGitDiff.ts): refactor getCommitMsgsPromisesFromFileDiffs function to use destructuring and improve readability
📝 (modules/commitlint/config.ts): add missing types to function parameters and improve readability by removing unnecessary comments and whitespace
📝 (modules/commitlint/utils.ts): fix indentation and add missing types to function parameters
📝 (prompts.ts): update INIT_MAIN_PROMPT description to include clarification on the use of present tense and line length
📝 (version.ts): fix import statements and add missing types to function parameters

*  (package.json): add @azure/openai dependency to support integration with Azure AI services
🔧 (config.ts): change CONFIG_KEYS.OCO_AZURE_API_VERSION to CONFIG_KEYS.OCO_AZURE_ENDPOINT to improve semantics and allow configuration of Azure endpoint URL
♻️ (config.ts): refactor configValidators to use OCO_AZURE_ENDPOINT instead of OCO_AZURE_API_VERSION and update validation message for OCO_AZURE_ENDPOINT
 (config.ts): add OCO_AZURE_ENDPOINT to getConfig function to retrieve Azure endpoint configuration from environment variables
 (azure.ts): introduce a new file azure.ts to implement Azure AI engine
 (azure.ts): implement generateCommitMessage function in Azure AI engine
 (prompts.ts): add a new line to INIT_MAIN_PROMPT to mention that changes within a single file should be described with a single commit message
♻️ (engine.ts): refactor getEngine function to add support for 'azure' as the AI provider and return the azure engine

* 📝 (prompts.ts): remove unnecessary information about crafting a concise commit message with a one single message for OCO_ONE_LINE_COMMIT configuration

* 3.0.14 (#333)

* test:  add the first E2E test and configuration to CI (#316)

* add tests

* Add push config (#220)

* feat: add instructions and support for configuring gpt-4-turbo (#320)

* 3.0.12

* build

* feat: add 'gpt-4-turbo' to supported models in README and config validation

---------

Co-authored-by: di-sukharev <dim.sukharev@gmail.com>

*  fix the broken E2E tests due to the addition of OCO_GITPUSH (#321)

* test(oneFile.test.ts): update test expectations to match new push prompt text

* build

* Feat: Add Claude 3 support (#318)

* 3.0.12

* build

* feat: anthropic claude 3 support

* fix: add system prompt

* fix: type check

* fix: package version

* fix: update anthropic for dependency bug fix

* feat: update build files

* feat: update version number

---------

Co-authored-by: di-sukharev <dim.sukharev@gmail.com>

* 🐛bug fix: enable to use the new format of OpenAI's project API Key (#328)

* fix(config.ts): remove validation for OCO_OPENAI_API_KEY length to accommodate variable key lengths

* build

* ♻️ refactor(config.ts): Addition of UnitTest environment and unittest for commands/config.ts#getConfig (#330)

* feat(jest.config.ts): update jest preset for TS ESM support and ignore patterns
feat(package.json): add test:unit script with NODE_OPTIONS for ESM
refactor(src/commands/config.ts): improve dotenv usage with dynamic paths
feat(src/commands/config.ts): allow custom config and env paths in getConfig
refactor(src/commands/config.ts): streamline environment variable access

feat(test/unit): add unit tests for config handling and utility functions

- Implement unit tests for `getConfig` function to ensure correct behavior
  in various scenarios including default values, global config, and local
  env file precedence.
- Add utility function `prepareFile` for creating temporary files during
  tests, facilitating testing of file-based configurations.

* feat(e2e.yml): add unit-test job to GitHub Actions for running unit tests on pull requests

* ci(test.yml): add GitHub Actions workflow for unit and e2e tests on pull requests

* refactor(config.ts): streamline environment variable access using process.env directly
test(config.test.ts): add setup and teardown for environment variables in tests to ensure test isolation

* feat(package.json): add `test:all` script to run all tests in Docker
refactor(package.json): consolidate Docker build steps into `test:docker-build` script for DRY principle
fix(package.json): ensure `test:unit:docker` and `test:e2e:docker` scripts use the same Docker image and remove container after run
chore(test/Dockerfile): remove default CMD to allow dynamic test script execution in Docker

* refactor(config.test.ts): anonymize API keys in tests for better security practices

* feat(config.test.ts): add tests for OCO_ANTHROPIC_API_KEY configuration

* refactor(config.ts): streamline path imports and remove unused DotenvParseOutput

- Simplify path module imports by removing default import and using named imports for `pathJoin` and `pathResolve`.
- Remove unused `DotenvParseOutput` import to clean up the code.

* refactor(config.test.ts): simplify API key mock values for clarity in tests

* test(config.test.ts): remove tests for default config values and redundant cases

- Removed tests that checked for default config values when no config or env files are present, as these scenarios are now handled differently.
- Eliminated tests for empty global config and local env files to streamline testing focus on actual config loading logic.
- Removed test for prioritizing local env over global config due to changes in config loading strategy, simplifying the configuration management.

* new version

---------

Co-authored-by: Takanori Matsumoto <matscube@gmail.com>
Co-authored-by: Moret84 <aurelienrivet@hotmail.fr>
Co-authored-by: yowatari <4982161+YOwatari@users.noreply.github.com>
Co-authored-by: metavind <94786679+metavind@users.noreply.github.com>

* 3.0.15

* build

* 🐛 (prepare-commit-msg-hook.ts): improve error message to cover missing OCO_ANTHROPIC_API_KEY and OCO_AZURE_API_KEY in addition to OCO_OPENAI_API_KEY

* 🐛 (azure.ts): fix check for OCO_AI_PROVIDER to properly assign provider variable
🐛 (azure.ts): initialize OpenAIClient only if provider is 'azure'
🔧 (Dockerfile): rearrange instructions to optimize caching by copying package.json and package-lock.json first before running npm ci and copying the rest of the files
🔧 (e2e/noChanges.test.ts): remove unnecessary line break
🔧 (e2e/oneFile.test.ts): remove unnecessary line break

---------

Co-authored-by: Takuya Ono <takuya-o@users.osdn.me>
Co-authored-by: GPT10 <57486732+di-sukharev@users.noreply.github.com>
Co-authored-by: Takanori Matsumoto <matscube@gmail.com>
Co-authored-by: Moret84 <aurelienrivet@hotmail.fr>
Co-authored-by: yowatari <4982161+YOwatari@users.noreply.github.com>
Co-authored-by: metavind <94786679+metavind@users.noreply.github.com>
Co-authored-by: di-sukharev <dim.sukharev@gmail.com>
2024-05-09 11:23:00 +03:00

214 lines
6.0 KiB
TypeScript

import {
ChatCompletionRequestMessage,
ChatCompletionRequestMessageRoleEnum
} from 'openai';
import { DEFAULT_TOKEN_LIMITS, getConfig } from './commands/config';
import { getMainCommitPrompt } from './prompts';
import { mergeDiffs } from './utils/mergeDiffs';
import { tokenCount } from './utils/tokenCount';
import { getEngine } from './utils/engine';
const config = getConfig();
const MAX_TOKENS_INPUT =
config?.OCO_TOKENS_MAX_INPUT || DEFAULT_TOKEN_LIMITS.DEFAULT_MAX_TOKENS_INPUT;
const MAX_TOKENS_OUTPUT =
config?.OCO_TOKENS_MAX_OUTPUT ||
DEFAULT_TOKEN_LIMITS.DEFAULT_MAX_TOKENS_OUTPUT;
const generateCommitMessageChatCompletionPrompt = async (
diff: string,
fullGitMojiSpec: boolean
): Promise<Array<ChatCompletionRequestMessage>> => {
const INIT_MESSAGES_PROMPT = await getMainCommitPrompt(fullGitMojiSpec);
const chatContextAsCompletionRequest = [...INIT_MESSAGES_PROMPT];
chatContextAsCompletionRequest.push({
role: ChatCompletionRequestMessageRoleEnum.User,
content: diff
});
return chatContextAsCompletionRequest;
};
export enum GenerateCommitMessageErrorEnum {
tooMuchTokens = 'TOO_MUCH_TOKENS',
internalError = 'INTERNAL_ERROR',
emptyMessage = 'EMPTY_MESSAGE',
outputTokensTooHigh = `Token limit exceeded, OCO_TOKENS_MAX_OUTPUT must not be much higher than the default ${DEFAULT_TOKEN_LIMITS.DEFAULT_MAX_TOKENS_OUTPUT} tokens.`
}
const ADJUSTMENT_FACTOR = 20;
export const generateCommitMessageByDiff = async (
diff: string,
fullGitMojiSpec: boolean
): Promise<string> => {
try {
const INIT_MESSAGES_PROMPT = await getMainCommitPrompt(fullGitMojiSpec);
const INIT_MESSAGES_PROMPT_LENGTH = INIT_MESSAGES_PROMPT.map(
(msg) => tokenCount(msg.content) + 4
).reduce((a, b) => a + b, 0);
const MAX_REQUEST_TOKENS =
MAX_TOKENS_INPUT -
ADJUSTMENT_FACTOR -
INIT_MESSAGES_PROMPT_LENGTH -
MAX_TOKENS_OUTPUT;
if (tokenCount(diff) >= MAX_REQUEST_TOKENS) {
const commitMessagePromises = await getCommitMsgsPromisesFromFileDiffs(
diff,
MAX_REQUEST_TOKENS,
fullGitMojiSpec
);
const commitMessages = [];
for (const promise of commitMessagePromises) {
commitMessages.push(await promise);
await delay(2000);
}
return commitMessages.join('\n\n');
}
const messages = await generateCommitMessageChatCompletionPrompt(
diff,
fullGitMojiSpec
);
const engine = getEngine();
const commitMessage = await engine.generateCommitMessage(messages);
if (!commitMessage)
throw new Error(GenerateCommitMessageErrorEnum.emptyMessage);
return commitMessage;
} catch (error) {
throw error;
}
};
function getMessagesPromisesByChangesInFile(
fileDiff: string,
separator: string,
maxChangeLength: number,
fullGitMojiSpec: boolean
) {
const hunkHeaderSeparator = '@@ ';
const [fileHeader, ...fileDiffByLines] = fileDiff.split(hunkHeaderSeparator);
// merge multiple line-diffs into 1 to save tokens
const mergedChanges = mergeDiffs(
fileDiffByLines.map((line) => hunkHeaderSeparator + line),
maxChangeLength
);
const lineDiffsWithHeader = [];
for (const change of mergedChanges) {
const totalChange = fileHeader + change;
if (tokenCount(totalChange) > maxChangeLength) {
// If the totalChange is too large, split it into smaller pieces
const splitChanges = splitDiff(totalChange, maxChangeLength);
lineDiffsWithHeader.push(...splitChanges);
} else {
lineDiffsWithHeader.push(totalChange);
}
}
const engine = getEngine();
const commitMsgsFromFileLineDiffs = lineDiffsWithHeader.map(
async (lineDiff) => {
const messages = await generateCommitMessageChatCompletionPrompt(
separator + lineDiff,
fullGitMojiSpec
);
return engine.generateCommitMessage(messages);
}
);
return commitMsgsFromFileLineDiffs;
}
function splitDiff(diff: string, maxChangeLength: number) {
const lines = diff.split('\n');
const splitDiffs = [];
let currentDiff = '';
if (maxChangeLength <= 0) {
throw new Error(GenerateCommitMessageErrorEnum.outputTokensTooHigh);
}
for (let line of lines) {
// If a single line exceeds maxChangeLength, split it into multiple lines
while (tokenCount(line) > maxChangeLength) {
const subLine = line.substring(0, maxChangeLength);
line = line.substring(maxChangeLength);
splitDiffs.push(subLine);
}
// Check the tokenCount of the currentDiff and the line separately
if (tokenCount(currentDiff) + tokenCount('\n' + line) > maxChangeLength) {
// If adding the next line would exceed the maxChangeLength, start a new diff
splitDiffs.push(currentDiff);
currentDiff = line;
} else {
// Otherwise, add the line to the current diff
currentDiff += '\n' + line;
}
}
// Add the last diff
if (currentDiff) {
splitDiffs.push(currentDiff);
}
return splitDiffs;
}
export const getCommitMsgsPromisesFromFileDiffs = async (
diff: string,
maxDiffLength: number,
fullGitMojiSpec: boolean
) => {
const separator = 'diff --git ';
const diffByFiles = diff.split(separator).slice(1);
// merge multiple files-diffs into 1 prompt to save tokens
const mergedFilesDiffs = mergeDiffs(diffByFiles, maxDiffLength);
const commitMessagePromises = [];
for (const fileDiff of mergedFilesDiffs) {
if (tokenCount(fileDiff) >= maxDiffLength) {
// if file-diff is bigger than gpt context — split fileDiff into lineDiff
const messagesPromises = getMessagesPromisesByChangesInFile(
fileDiff,
separator,
maxDiffLength,
fullGitMojiSpec
);
commitMessagePromises.push(...messagesPromises);
} else {
const messages = await generateCommitMessageChatCompletionPrompt(
separator + fileDiff,
fullGitMojiSpec
);
const engine = getEngine();
commitMessagePromises.push(engine.generateCommitMessage(messages));
}
}
return commitMessagePromises;
};
function delay(ms: number) {
return new Promise((resolve) => setTimeout(resolve, ms));
}