
How can I find names in text?

Explorer, Jan 10, 2017

Hi, there.

We have a text document in which we want to find all (Biblical) names.

Ideas that we had were:

1. Search for all words that begin with a capital letter.

     PROBLEM: It will also capture the beginning of all sentences [etc.]. Very tedious to weed out all the extras.

2. Search for all words that are not in the dictionary.

     QUESTION: Is that possible?

Any help would be appreciated.

SK

TOPICS
Scripting

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more

1 Correct answer

Guru, Jan 10, 2017

You can probably find an online list of both anglicized and Hebrew transliterations of the names; that would be a good start.

You will also have to distinguish Adams from Adam, etc.
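
To illustrate the Adam/Adams point, here is a rough Python sketch that works on text exported from the document. It is not InDesign scripting, and the function name `find_exact_name` and the sample sentence are made up for the example. It assumes a whole-word match is wanted, with an optional possessive.

```python
import re
from typing import List

def find_exact_name(name: str, text: str) -> List[str]:
    """Return whole-word matches of `name`, optionally followed by a possessive 's."""
    # \b on both sides keeps "Adam" from matching inside "Adams".
    pattern = re.compile(r"\b" + re.escape(name) + r"(?:['’]s)?\b")
    return pattern.findall(text)

# Hypothetical sample text, for illustration only.
sample = "Adam named the animals. Mr. Adams disagreed. Adam's sons were Cain and Abel."

print(find_exact_name("Adam", sample))   # ['Adam', "Adam's"]
print(find_exact_name("Adams", sample))  # ['Adams']
```

Whether possessives and plurals should count as hits is a decision for whoever builds the list; the optional group can simply be removed.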

New Here, Jan 10, 2017

I'd say this is pretty hard to do. If you wanted to find them in InDesign, probably your best bet would be to include a dictionary of accepted names, which you could presumably find somewhere, and then pair it with regex(es) to find what you wanted in the document. This is hard and probably more trouble than it's worth (depending on how critical this is). A better option might be to export the text and find some sort of third-party AI tool that can do this for you. I highly doubt you want to write your own artificial intelligence thing.
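
For what it's worth, the "dictionary of names plus regex" idea could look roughly like this in Python, run on exported text rather than inside InDesign. The name list, the sample sentence, and the function names are all assumptions made up for the sketch; a real list would be loaded from a file.

```python
import re
from typing import List, Tuple

# Hypothetical name list; a real one would come from a file you found or compiled.
NAMES = ["Abraham", "Isaac", "Jacob"]

def build_name_pattern(names: List[str]) -> "re.Pattern":
    """Compile one whole-word regex that matches any name in the list."""
    alternation = "|".join(re.escape(n) for n in sorted(set(names)))
    return re.compile(r"\b(?:" + alternation + r")\b")

def find_names(text: str, names: List[str]) -> List[Tuple[int, str]]:
    """Return (character offset, name) for every hit in the text."""
    return [(m.start(), m.group()) for m in build_name_pattern(names).finditer(text)]

# Hypothetical exported text.
exported = "Then Abraham took Isaac his son. Jacobs is not Jacob."

print([name for _, name in find_names(exported, NAMES)])
# ['Abraham', 'Isaac', 'Jacob']
```

The offsets are there so the hits can be traced back to their place in the exported text.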

Explorer, Jan 10, 2017

Can I script a grep/text search for all "misspelled" words?

Guru, Jan 10, 2017

You can check out www.mindsteam.com/products/mindspellpro/index.html. I don't know if he's got a release for later versions of InDesign, but the plugin allows scripted access to spelling errors.

I think that, as Obi wrote, it might be a good idea to export a list to an external file for manual editing.

Note:

Not every capitalized spelling error is going to be a name.

Not every name is going to be a spelling error. Adam?

The first word of a sentence may or may not be a name.

You might want to make separate lists.
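
One possible way to make those separate lists from exported text is sketched below in Python. The name list, the dictionary, and the sample text are tiny made-up stand-ins, and the sentence splitting is deliberately naive (it only handles ASCII letters and `.`, `!`, `?`), so treat it as a starting point only.

```python
import re
from typing import Dict, Set

# Hypothetical stand-ins for a real name list and a real dictionary word list.
KNOWN_NAMES = {"Adam", "Eve"}
DICTIONARY = {"walked", "in", "the", "garden", "then", "spoke", "to", "listened"}

def classify(text: str, known_names: Set[str], dictionary: Set[str]) -> Dict[str, Set[str]]:
    """Sort capitalized words into three separate review lists."""
    lists = {"known_names": set(), "sentence_initial": set(), "unknown_capitalized": set()}
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = re.findall(r"[A-Za-z]+", sentence)
        for i, word in enumerate(words):
            if not word[0].isupper():
                continue
            if word in known_names:
                lists["known_names"].add(word)          # a name, even if spelled "correctly"
            elif i == 0:
                lists["sentence_initial"].add(word)     # may or may not be a name
            elif word.lower() not in dictionary:
                lists["unknown_capitalized"].add(word)  # likely candidate, still needs review
    return lists

sample = "Adam walked in the garden. Then Eve spoke to Methuselah. Seth listened."
result = classify(sample, KNOWN_NAMES, DICTIONARY)
for label in sorted(result):
    print(label, sorted(result[label]))
```

In this sample "Seth" and "Then" land in the same sentence-initial list, which is exactly the case that still needs a human to decide.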

Guide, Jan 10, 2017

Many IndexMatic users have had to deal with searching for and/or indexing proper names. See, for example, "Indiscripts :: IndexMatic 2 | Frequently Asked Questions [UPDATE]" and my suggested approach (in French, sorry :-/): "Indiscripts :: IndexMatic | Stratégie d'indexation des noms propres".

In all cases the key rule is to use and gradually refine a dedicated word list (which in iX becomes a “query list”) and then to run it as a regular expression across the document. Of course I don't pretend IndexMatic is what you need to achieve your task: Peter Kahrel, Id-Extras and many of my colleagues here have developed great scripts and utilities that might do the job as well. My point is just that you will likely have to use, or implement, a finely tuned regex-based script.
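
To give an idea of what "refining a word list and running it as a regular expression" can mean in practice, here is a small Python sketch on exported text. It is not IndexMatic's query-list syntax; the `QUERIES` entries, the sample text, and `count_hits` are hypothetical. Each entry is a regex that can be tightened or widened after looking at the hit counts.

```python
import re
from typing import Dict

# Hypothetical query list: one label per name, one regex per label.
QUERIES = {
    "Abraham": r"Abra(?:ha)?m",  # also catches the variant "Abram"
    "Sarah": r"Sara[hi]",        # "Sarah" and "Sarai"
}

def count_hits(text: str, queries: Dict[str, str]) -> Dict[str, int]:
    """Count whole-word hits for each query so the list can be refined."""
    return {
        label: len(re.findall(r"\b(?:" + pattern + r")\b", text))
        for label, pattern in queries.items()
    }

# Hypothetical sample text.
sample = "Abram was later called Abraham, and Sarai became Sarah."

print(count_hits(sample, QUERIES))
# {'Abraham': 2, 'Sarah': 2}
```

A query with zero hits, or with suspiciously many, is a hint that its pattern needs another pass.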

@+

Marc

LEGEND, Jan 10, 2017

Hi,

About proper names, I'd extract all of them into a new file, sort them (1 minute to do it), and read through the list, deleting what we don't want (1 minute more, because I read very quickly! =D It's a joke!).
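
A minimal sketch of that extract-and-sort step, assuming the text has already been exported from the document, might look like this in Python. The function names, the sample text, and the output path are hypothetical, and the pattern only catches simple ASCII words that start with a capital.

```python
import re
from collections import Counter
from typing import List

def capitalized_word_report(text: str) -> List[str]:
    """Return one 'word<TAB>count' line per distinct capitalized word, sorted A-Z."""
    counts = Counter(re.findall(r"\b[A-Z][a-z]+\b", text))
    return ["{}\t{}".format(word, n) for word, n in sorted(counts.items())]

def write_report(text: str, path: str) -> None:
    """Write the report to a new file for manual review and deletion."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(capitalized_word_report(text)) + "\n")

# Hypothetical exported text.
sample = "Moses spoke to Aaron. Then Aaron answered Moses."

for line in capitalized_word_report(sample):
    print(line)
```

Calling `write_report(sample, "names_to_review.txt")` would put the same lines in a file that can be read through and pruned by hand.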

After that, what do you want to do with this list?

(^/)
