Identify the language of text with ML Kit on iOS
You can use ML Kit to identify the language of a string of text. You can
get the string's most likely language as well as confidence scores for all of the
string's possible languages.
ML Kit recognizes text in more than 100 different languages in their native scripts.
In addition, romanized text can be recognized for Arabic, Bulgarian, Chinese,
Greek, Hindi, Japanese, and Russian. See the
complete list of supported languages and scripts.
Note: ML Kit iOS APIs only run on 64-bit devices. If you build your app with
32-bit support, check the device's architecture before using this API.
Try it out
Play around with the sample app
(https://github.com/googlesamples/mlkit/tree/master/ios/quickstarts/languageid)
to see an example usage of this API.
Before you begin
Include the following ML Kit pod in your Podfile:
pod 'GoogleMLKit/LanguageID', '8.0.0'
After you install or update your project's pods, open your Xcode project using its
.xcworkspace. ML Kit is supported in Xcode version 12.4 or greater.
Identify the language of a string
To identify the language of a string, get an instance of
LanguageIdentification, and then pass the string to the
identifyLanguage(for:) method.
For example:
Swift
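import MLKitLanguageID

// `text` is the String whose language you want to identify.
let languageId = LanguageIdentification.languageIdentification()

languageId.identifyLanguage(for: text) { (languageCode, error) in
  if let error = error {
    print("Failed with error: \(error)")
    return
  }
  // "und" means no language could be detected with enough confidence.
  if let languageCode = languageCode, languageCode != "und" {
    print("Identified Language: \(languageCode)")
  } else {
    print("No language was identified")
  }
}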
Objective-C
MLKLanguageIdentification *languageId = [MLKLanguageIdentification languageIdentification];

[languageId identifyLanguageForText:text
                         completion:^(NSString * _Nullable languageCode,
                                      NSError * _Nullable error) {
  if (error != nil) {
    NSLog(@"Failed with error: %@", error.localizedDescription);
    return;
  }
  if (![languageCode isEqualToString:@"und"]) {
    NSLog(@"Identified Language: %@", languageCode);
  } else {
    NSLog(@"No language was identified");
  }
}];
If the call succeeds, a
BCP-47 language code is
passed to the completion handler, indicating the language of the text. If no
language could be confidently detected, the code und (undetermined) is passed.
By default, ML Kit returns a non-und value only when it identifies the
language with a confidence value of at least 0.5. You can change this threshold
by passing a LanguageIdentificationOptions object to
languageIdentification(options:).
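As a minimal sketch of changing the threshold, the snippet below builds the options object and requests an identifier configured with it. The value 0.4 is only illustrative, the factory method is called on LanguageIdentification (the class the text names), and the import line assumes the module name provided by the GoogleMLKit/LanguageID pod.

```swift
import MLKitLanguageID  // assumed module name for the GoogleMLKit/LanguageID pod

// 0.4 is an illustrative threshold, not a recommended value.
let options = LanguageIdentificationOptions(confidenceThreshold: 0.4)
let languageId = LanguageIdentification.languageIdentification(options: options)
```

In Objective-C, the equivalent is to create MLKLanguageIdentificationOptions with initWithConfidenceThreshold: and pass it to languageIdentificationWithOptions:.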
Get the possible languages of a string
To get the confidence values of a string's most likely languages, get an
instance of LanguageIdentification and then pass the string to the
identifyPossibleLanguages(for:) method.
For example:
Swift
import MLKitLanguageID

// `text` is the String whose possible languages you want to list.
let languageId = LanguageIdentification.languageIdentification()

languageId.identifyPossibleLanguages(for: text) { (identifiedLanguages, error) in
  if let error = error {
    print("Failed with error: \(error)")
    return
  }
  // A first item of "und" means no language met the confidence threshold.
  guard let identifiedLanguages = identifiedLanguages,
    !identifiedLanguages.isEmpty,
    identifiedLanguages[0].languageCode != "und"
  else {
    print("No language was identified")
    return
  }

  print("Identified Languages:\n" +
    identifiedLanguages.map {
      String(format: "(%@, %.2f)", $0.languageCode, $0.confidence)
    }.joined(separator: "\n"))
}
Objective-C
MLKLanguageIdentification *languageId = [MLKLanguageIdentification languageIdentification];

[languageId identifyPossibleLanguagesForText:text
                                  completion:^(NSArray<MLKIdentifiedLanguage *> * _Nonnull identifiedLanguages,
                                               NSError * _Nullable error) {
  if (error != nil) {
    NSLog(@"Failed with error: %@", error.localizedDescription);
    return;
  }
  if (identifiedLanguages.count == 1
      && [identifiedLanguages[0].languageCode isEqualToString:@"und"]) {
    NSLog(@"No language was identified");
    return;
  }
  NSMutableString *outputText = [NSMutableString stringWithString:@"Identified Languages:"];
  for (MLKIdentifiedLanguage *language in identifiedLanguages) {
    [outputText appendFormat:@"\n(%@, %.2f)", language.languageCode, language.confidence];
  }
  // Pass the text as an argument so it is never interpreted as a format string.
  NSLog(@"%@", outputText);
}];
If the call succeeds, a list of IdentifiedLanguage objects is passed to the
completion handler. From each object, you can get the language's BCP-47 code
and the confidence that the string is in that language. Note that
these values indicate the confidence that the entire string is in the given
language; ML Kit doesn't identify multiple languages in a single string.
By default, ML Kit returns only languages with confidence values of at least
0.01. You can change this threshold by passing a
LanguageIdentificationOptions object to languageIdentification(options:).
If no language meets this threshold, the list has one item, with the value
und.
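One way an app might interpret the returned list is sketched below. This is a hypothetical helper, not part of ML Kit: Candidate is a stand-in struct that mirrors the languageCode and confidence properties of IdentifiedLanguage so the logic can run without the SDK, and rankedLanguages and minimumConfidence are names invented for illustration.

```swift
// Stand-in for ML Kit's IdentifiedLanguage (hypothetical, for illustration only).
struct Candidate {
    let languageCode: String
    let confidence: Float
}

/// Returns candidates at or above `minimumConfidence`, most confident first.
/// An empty result means no language was identified.
func rankedLanguages(_ candidates: [Candidate],
                     minimumConfidence: Float = 0.01) -> [Candidate] {
    return candidates
        // "und" marks an undetermined result, so it is never a usable language.
        .filter { $0.languageCode != "und" && $0.confidence >= minimumConfidence }
        .sorted { $0.confidence > $1.confidence }
}

let ranked = rankedLanguages(
    [Candidate(languageCode: "es", confidence: 0.31),
     Candidate(languageCode: "pt", confidence: 0.62),
     Candidate(languageCode: "gl", confidence: 0.04)],
    minimumConfidence: 0.1)
print(ranked.map { $0.languageCode })  // ["pt", "es"]

// The single-item "und" list collapses to an empty result.
let undetermined = rankedLanguages([Candidate(languageCode: "und", confidence: 1.0)])
print(undetermined.isEmpty)  // true
```

Inside the completion handler, the same filtering could be applied after mapping each IdentifiedLanguage to its languageCode and confidence, which keeps the "no language identified" case in one place.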