Identify the language of text with ML Kit on iOS
You can use ML Kit to identify the language of a string of text. You can get the string's most likely language as well as confidence scores for all of the string's possible languages.
ML Kit recognizes text in more than 100 different languages in their native scripts. In addition, romanized text can be recognized for Arabic, Bulgarian, Chinese, Greek, Hindi, Japanese, and Russian. See the complete list of supported languages and scripts.
Note: ML Kit iOS APIs only run on 64-bit devices. If you build your app with 32-bit support, check the device's architecture before using this API.
Try it out
Play around with the sample app (https://github.com/googlesamples/mlkit/tree/master/ios/quickstarts/languageid) to see an example usage of this API.
Before you begin
Include the following ML Kit pods in your Podfile:
pod 'GoogleMLKit/LanguageID', '8.0.0'
After you install or update your project's Pods, open your Xcode project using its .xcworkspace file. ML Kit is supported in Xcode version 12.4 or greater.
Identify the language of a string
To identify the language of a string, get an instance of LanguageIdentification, and then pass the string to the identifyLanguage(for:) method.
For example:
Swift
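```swift
import MLKit

// `text` is the String whose language you want to identify.
let languageId = LanguageIdentification.languageIdentification()

languageId.identifyLanguage(for: text) { (languageCode, error) in
  if let error = error {
    print("Failed with error: \(error)")
    return
  }
  // "und" means no language could be identified with enough confidence.
  if let languageCode = languageCode, languageCode != "und" {
    print("Identified Language: \(languageCode)")
  } else {
    print("No language was identified")
  }
}
```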
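Objective-C
```objective-c
@import MLKit;

// `text` is the NSString whose language you want to identify.
MLKLanguageIdentification *languageId = [MLKLanguageIdentification languageIdentification];

[languageId identifyLanguageForText:text
                         completion:^(NSString * _Nullable languageCode,
                                      NSError * _Nullable error) {
  if (error != nil) {
    NSLog(@"Failed with error: %@", error.localizedDescription);
    return;
  }
  // Check for nil first: a message sent to nil returns NO, which would
  // otherwise be reported as an identified language.
  if (languageCode != nil && ![languageCode isEqualToString:@"und"]) {
    NSLog(@"Identified Language: %@", languageCode);
  } else {
    NSLog(@"No language was identified");
  }
}];
```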
If the call succeeds, a BCP-47 language code is passed to the completion handler, indicating the language of the text. If no language could be confidently detected, the code und (undetermined) is passed.
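A BCP-47 code such as fr is rarely what you want to show a user. The following is a minimal sketch of turning the code into a display name with Foundation's Locale. The helper name displayName(forLanguageCode:in:) is hypothetical and is not part of ML Kit; the sketch only assumes that the completion handler gave you a code string.

```swift
import Foundation

/// Returns a human-readable name for a BCP-47 language code, or nil when the
/// code is "und" (undetermined) or the locale has no name for it.
/// Hypothetical helper for illustration; not an ML Kit API.
func displayName(forLanguageCode code: String,
                 in locale: Locale = .current) -> String? {
  guard code != "und" else { return nil }
  return locale.localizedString(forIdentifier: code)
}

let english = Locale(identifier: "en_US")
print(displayName(forLanguageCode: "fr", in: english) ?? "unknown")
print(displayName(forLanguageCode: "und", in: english) ?? "unknown")  // prints "unknown"
```

Treating und as "no name" keeps the undetermined case explicit, so the caller decides what to show instead of receiving a placeholder string.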
By default, ML Kit returns a value other than und only when it identifies the language with a confidence value of at least 0.5. You can change this threshold by passing a LanguageIdentificationOptions object to languageIdentification(options:).
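As a sketch of changing the threshold described above, the snippet below creates an identifier that accepts a language at a confidence of 0.4 instead of 0.5. The value 0.4 and the sample text are illustrative only, and the entry point is assumed to be the LanguageIdentification class named in this guide.

```swift
import MLKit

// Accept a language once ML Kit is at least 40% confident.
// 0.4 is an illustrative value, not a recommendation.
let options = LanguageIdentificationOptions(confidenceThreshold: 0.4)
let languageId = LanguageIdentification.languageIdentification(options: options)

let text = "Bonjour tout le monde"  // sample input for illustration

languageId.identifyLanguage(for: text) { languageCode, error in
  if let error = error {
    print("Failed with error: \(error)")
    return
  }
  // With a lower threshold, "und" is returned less often, but a returned
  // code may be less reliable.
  print("Result: \(languageCode ?? "und")")
}
```

Lowering the threshold trades fewer und results for a higher chance of a wrong guess, so it is worth checking against text typical of your app.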
Get the possible languages of a string
To get the confidence values of a string's most likely languages, get an instance of LanguageIdentification, and then pass the string to the identifyPossibleLanguages(for:) method.
For example:
Swift
```swift
import MLKit

// `text` is the String whose possible languages you want to list.
let languageId = LanguageIdentification.languageIdentification()

languageId.identifyPossibleLanguages(for: text) { (identifiedLanguages, error) in
  if let error = error {
    print("Failed with error: \(error)")
    return
  }
  // A leading "und" entry means no language met the confidence threshold.
  guard let identifiedLanguages = identifiedLanguages,
    !identifiedLanguages.isEmpty,
    identifiedLanguages[0].languageCode != "und"
  else {
    print("No language was identified")
    return
  }

  print("Identified Languages:\n" +
    identifiedLanguages.map {
      String(format: "(%@, %.2f)", $0.languageCode, $0.confidence)
    }.joined(separator: "\n"))
}
```
Objective-C
```objective-c
@import MLKit;

// `text` is the NSString whose possible languages you want to list.
MLKLanguageIdentification *languageId = [MLKLanguageIdentification languageIdentification];

[languageId identifyPossibleLanguagesForText:text
                                  completion:^(NSArray<MLKIdentifiedLanguage *> * _Nullable identifiedLanguages,
                                               NSError * _Nullable error) {
  if (error != nil) {
    NSLog(@"Failed with error: %@", error.localizedDescription);
    return;
  }
  // A single "und" entry means no language met the confidence threshold.
  if (identifiedLanguages.count == 1
      && [identifiedLanguages[0].languageCode isEqualToString:@"und"]) {
    NSLog(@"No language was identified");
    return;
  }
  NSMutableString *outputText = [NSMutableString stringWithString:@"Identified Languages:"];
  for (MLKIdentifiedLanguage *language in identifiedLanguages) {
    [outputText appendFormat:@"\n(%@, %.2f)", language.languageCode, language.confidence];
  }
  // Pass the text as an argument, never as the format string itself.
  NSLog(@"%@", outputText);
}];
```
If the call succeeds, a list of IdentifiedLanguage objects is passed to the completion handler. From each object, you can get the language's BCP-47 code and the confidence that the string is in that language. Note that these values indicate the confidence that the entire string is in the given language; ML Kit doesn't identify multiple languages in a single string.
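Because each confidence value applies to the string as a whole, two close scores usually mean the result is ambiguous rather than that the text mixes languages. The sketch below shows one possible way to act on that: pick the top candidate only when it leads the runner-up by a margin. The Candidate struct is a stand-in that mirrors the languageCode and confidence properties of IdentifiedLanguage so the example runs without the SDK; bestLanguage(in:margin:) and the 0.2 margin are assumptions for illustration, not ML Kit behavior.

```swift
/// Stand-in for ML Kit's `IdentifiedLanguage`, so the sketch runs without the SDK.
struct Candidate {
  let languageCode: String
  let confidence: Float
}

/// Returns the most confident candidate, or nil if the list is empty,
/// the best candidate is "und", or the runner-up is within `margin` of it.
func bestLanguage(in candidates: [Candidate], margin: Float = 0.2) -> Candidate? {
  let ranked = candidates.sorted { $0.confidence > $1.confidence }
  guard let top = ranked.first, top.languageCode != "und" else { return nil }
  if ranked.count > 1, top.confidence - ranked[1].confidence < margin {
    return nil  // Too close to call for the string as a whole.
  }
  return top
}

let clear = [Candidate(languageCode: "en", confidence: 0.91),
             Candidate(languageCode: "nl", confidence: 0.05)]
let ambiguous = [Candidate(languageCode: "es", confidence: 0.48),
                 Candidate(languageCode: "pt", confidence: 0.43)]

print(bestLanguage(in: clear)?.languageCode ?? "none")      // prints "en"
print(bestLanguage(in: ambiguous)?.languageCode ?? "none")  // prints "none"
```

In an app, you would map the IdentifiedLanguage objects from the completion handler into this shape, or adapt the function to take them directly.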
By default, ML Kit returns only languages with confidence values of at least 0.01. You can change this threshold by passing a LanguageIdentificationOptions object to languageIdentification(options:). If no language meets this threshold, the list has one item, with the value und.
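The following sketch combines a custom threshold with identifyPossibleLanguages(for:) and handles the single und entry. The threshold of 0.1 and the sample text are illustrative, and the entry point is assumed to be the LanguageIdentification class named in this guide.

```swift
import MLKit

// Drop candidates below 10% confidence. 0.1 is an illustrative value.
let options = LanguageIdentificationOptions(confidenceThreshold: 0.1)
let languageId = LanguageIdentification.languageIdentification(options: options)

let text = "Guten Morgen, wie geht es dir?"  // sample input for illustration

languageId.identifyPossibleLanguages(for: text) { identifiedLanguages, error in
  if let error = error {
    print("Failed with error: \(error)")
    return
  }
  let candidates = identifiedLanguages ?? []
  // When nothing meets the threshold, the list holds a single "und" entry.
  if candidates.isEmpty
    || (candidates.count == 1 && candidates[0].languageCode == "und") {
    print("No language met the threshold")
    return
  }
  for candidate in candidates {
    print("\(candidate.languageCode): \(candidate.confidence)")
  }
}
```

A higher threshold shortens the list to the strongest candidates, which can simplify any ranking you do afterwards.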
Last updated: 2025-09-04 UTC.