ArctosDB/arctos

Do we need agent guardrails? (was: funky agents)

Closed this issue · 84 comments

Do we need rules or guidance around agents?

I've noticed some not-great agent data being created. I don't know whether @ArctosDB/agents-committee would care to attempt to establish any guardrails, or whether this is fine as it is. Please advise, or close if nobody cares.

Possible Actions

Examples and ponderings and such follow


Here are agents with nonunique preferred names:


select
    agent_id,
    agent_type,
    preferred_agent_name,
    getpreferredagentname(created_by_agent_id) creator,
    created_date
from 
    agent
where
    preferred_agent_name in (select preferred_agent_name from agent group by preferred_agent_name having count(*) > 1)
order by preferred_agent_name,created_date desc;


 agent_id |  agent_type  | preferred_agent_name |         creator          |        created_date        
----------+--------------+----------------------+--------------------------+----------------------------
 21352027 | person       | Allison Nelson       | Jonathan L. Dunnum       | 2024-03-25 08:55:51.725221
 21350942 | person       | Allison Nelson       | Katherine L. Anderson    | 2024-01-31 14:53:50.530417
 21301738 | person       | Ben D. Marks         | Charles M. Dardia        | 2016-06-28 13:32:51
 21248037 | person       | Ben D. Marks         | unknown                  | 2013-12-16 21:49:31
 21351392 | person       | Bruce B. Paige       | Derek S. Sikes           | 2024-03-05 13:38:17.074186
 21348083 | person       | Bruce B. Paige       | Teresa J. Mayfield-Meyer | 2023-04-17 16:58:23.684949
 21351197 | person       | David Johnson        | Derek S. Sikes           | 2024-02-26 19:05:23.736418
 21295057 | person       | David Johnson        | Dusty L. McDonald        | 2015-10-06 11:30:13
 21352097 | organization | DOI Foundation       | C. O. Webb               | 2024-04-03 20:28:58.545203
 21348956 | organization | DOI Foundation       | Dusty L. McDonald        | 2023-06-29 08:16:07.045192
 21351450 | person       | G. S. Tulloch        | Derek S. Sikes           | 2024-03-05 13:38:17.997551
 21351074 | person       | G. S. Tulloch        | Jayce Williamson         | 2024-02-12 13:45:24.589931
 21351805 | person       | Jared Hughey         | Derek S. Sikes           | 2024-03-05 13:38:29.414319
 21348137 | person       | Jared Hughey         | Justin Fulkerson         | 2023-04-21 18:07:15.093075
 21351445 | person       | J. Jacobs            | Derek S. Sikes           | 2024-03-05 13:38:17.907213
 21350919 | person       | J. Jacobs            | Jessica Weller           | 2024-01-27 11:25:38.328081
 21351651 | person       | Laura Lofgren        | Derek S. Sikes           | 2024-03-05 13:38:26.615875
 21349012 | person       | Laura Lofgren        | Jayce Williamson         | 2023-07-16 14:23:47.257662
 21333621 | person       | Lauren Wilson        | Zack Perry               | 2021-07-13 12:39:08.99178
 21300714 | person       | Lauren Wilson        | Erica Krimmel            | 2016-04-12 12:17:17
 21351943 | person       | Mary Ann Sundown     | Angela Linn              | 2024-03-09 20:57:52.329768
 21347114 | person       | Mary Ann Sundown     | Shealyn Golden           | 2023-01-27 02:25:06.394785
 21351324 | person       | R. Leiner            | Derek S. Sikes           | 2024-03-05 13:37:47.64515
 21349050 | person       | R. Leiner            | Jayce Williamson         | 2023-07-26 15:59:10.979065

And here are recent person-agents without a first or last name, many of which are clearly not persons:

select
    agent.agent_id,
    preferred_agent_name,
    getpreferredagentname(agent.created_by_agent_id) creator,
    agent.created_date
from 
    agent
    left outer join agent_attribute on agent.agent_id=agent_attribute.agent_id and agent_attribute.attribute_type in ('first name','last name')
where
    agent.agent_type='person' and
    agent_attribute.attribute_id is null 
    and agent.created_date > current_date - interval '1 year' -- remove this line for all; it's too much to paste here
order by agent.created_date desc
;
 agent_id |                  preferred_agent_name                  |     creator     |        created_date        
----------+--------------------------------------------------------+-----------------+----------------------------
 21352122 | Jack Spratt                                            | Jozef A. Slowik | 2024-04-05 12:12:44.141425
 21352119 | C. Stillman                                            | Jozef A. Slowik | 2024-04-04 16:05:10.937667
 21352117 | B. S. Blitz                                            | Jozef A. Slowik | 2024-04-04 15:34:50.688057
 21352116 | F. Sorensen                                            | Jozef A. Slowik | 2024-04-04 15:25:19.630444
 21352109 | M. Rosy                                                | Jozef A. Slowik | 2024-04-04 12:59:13.243463
 21352057 | unrecorded                                             | Angela Linn     | 2024-03-26 16:19:24.093752
 21351875 | Bundtzen                                               | Derek S. Sikes  | 2024-03-05 13:38:39.295728
 21351874 | Sid                                                    | Derek S. Sikes  | 2024-03-05 13:38:39.285538
 21351873 | Kenai Veterinary Clinic                                | Derek S. Sikes  | 2024-03-05 13:38:39.275253
 21351872 | Schmidt                                                | Derek S. Sikes  | 2024-03-05 13:38:39.262718
 21351870 | Chester                                                | Derek S. Sikes  | 2024-03-05 13:38:39.226274
 21351818 | Snarski                                                | Derek S. Sikes  | 2024-03-05 13:38:29.660891
 21351809 | Lucas                                                  | Derek S. Sikes  | 2024-03-05 13:38:29.487845
 21351798 | Buck                                                   | Derek S. Sikes  | 2024-03-05 13:38:29.300602
 21351784 | Galena Butterfly Festival Participants                 | Derek S. Sikes  | 2024-03-05 13:38:29.048438
 21351782 | Dick H. Bishop                                         | Derek S. Sikes  | 2024-03-05 13:38:29.023773
 21351746 | Femaida                                                | Derek S. Sikes  | 2024-03-05 13:38:28.354653
 21351744 | Challet                                                | Derek S. Sikes  | 2024-03-05 13:38:28.329927
 21351719 | Southside Animal Hospital                              | Derek S. Sikes  | 2024-03-05 13:38:27.869955
 21351701 | Gwichin Renewable Resources                            | Derek S. Sikes  | 2024-03-05 13:38:27.544315
 21351685 | v. Doesburg                                            | Derek S. Sikes  | 2024-03-05 13:38:27.270269
 21351611 | et al.                                                 | Derek S. Sikes  | 2024-03-05 13:38:25.915341
 21351598 | Sjodin                                                 | Derek S. Sikes  | 2024-03-05 13:38:25.657055
 21351592 | L. Shults                                              | Derek S. Sikes  | 2024-03-05 13:38:20.591105
 21351570 | Schuh & Gray                                           | Derek S. Sikes  | 2024-03-05 13:38:20.233703
 21351546 | R. Latta                                               | Derek S. Sikes  | 2024-03-05 13:38:19.846565
 21351543 | S. Craig                                               | Derek S. Sikes  | 2024-03-05 13:38:19.80164
 21351542 | Bio 116 students                                       | Derek S. Sikes  | 2024-03-05 13:38:19.790886
 21351511 | Waterways Vet Clinic                                   | Derek S. Sikes  | 2024-03-05 13:38:19.29779
 21351485 | Fran & Pete                                            | Derek S. Sikes  | 2024-03-05 13:38:18.724329
 21351479 | Smithhisher                                            | Derek S. Sikes  | 2024-03-05 13:38:18.612474
 21351478 | Unalakleet School students                             | Derek S. Sikes  | 2024-03-05 13:38:18.601517
 21351472 | Christian F. Weisser                                   | Derek S. Sikes  | 2024-03-05 13:38:18.499276
 21351451 | D. A. P.                                               | Derek S. Sikes  | 2024-03-05 13:38:18.021055
 21351438 | Stoneman                                               | Derek S. Sikes  | 2024-03-05 13:38:17.796991
 21351435 | Chriska Derr                                           | Derek S. Sikes  | 2024-03-05 13:38:17.748558
 21351434 | IAS                                                    | Derek S. Sikes  | 2024-03-05 13:38:17.738767
 21351433 | Lehman                                                 | Derek S. Sikes  | 2024-03-05 13:38:17.721695
 21351430 | Pam's pet grooming                                     | Derek S. Sikes  | 2024-03-05 13:38:17.680503
 21351418 | Calkins                                                | Derek S. Sikes  | 2024-03-05 13:38:17.505252
 21351396 | College Village Animal Clinic                          | Derek S. Sikes  | 2024-03-05 13:38:17.138685
 21351393 | Southeast Alaska Animal Medical Center                 | Derek S. Sikes  | 2024-03-05 13:38:17.094364
 21351381 | Roberts                                                | Derek S. Sikes  | 2024-03-05 13:38:16.894943
 21351347 | Stream Ecology Class UAF                               | Derek S. Sikes  | 2024-03-05 13:38:16.3071
 21351345 | Plant Protection Division Ministry of Agriculture USSR | Derek S. Sikes  | 2024-03-05 13:38:16.274119
 21351302 | Taiga                                                  | Derek S. Sikes  | 2024-03-05 13:37:47.281312
 21351297 | DHS                                                    | Derek S. Sikes  | 2024-03-05 13:37:47.19272
 21351294 | Wasilla Veterinary Clinic                              | Derek S. Sikes  | 2024-03-05 13:37:47.138723
 21351282 | McCarthy                                               | Derek S. Sikes  | 2024-03-05 13:34:13.480942
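
One shape a guardrail could take is a check run before an agent is created, covering the two problems queried above: a preferred name that is already in use, and a person with neither a first nor a last name. This is only a sketch under assumptions: it uses Python's sqlite3 with a toy `agent` table holding just the columns seen in the queries, the function name `precheck_agent` is hypothetical, and it is not how Arctos creates agents.

```python
import sqlite3

def precheck_agent(conn, agent_type, preferred_name, attribute_types):
    """Return a list of warnings; an empty list means nothing was flagged."""
    warnings = []
    # Same test as the nonunique-name query: is this preferred name already taken?
    n = conn.execute(
        "select count(*) from agent where preferred_agent_name = ?",
        (preferred_name,),
    ).fetchone()[0]
    if n > 0:
        warnings.append(f"{n} existing agent(s) already use this preferred name")
    # Same test as the second query: a person with no 'first name' or 'last name'.
    if agent_type == "person" and not {"first name", "last name"} & set(attribute_types):
        warnings.append("person agent has neither a first name nor a last name")
    return warnings

# Toy data (assumption): one row copied from the listing above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table agent (agent_id integer primary key, agent_type text, preferred_agent_name text)"
)
conn.execute("insert into agent values (21348083, 'person', 'Bruce B. Paige')")

print(precheck_agent(conn, "person", "Bruce B. Paige", ["first name", "last name"]))
print(precheck_agent(conn, "person", "Kenai Veterinary Clinic", []))
print(precheck_agent(conn, "organization", "Kenai Veterinary Clinic", []))
```

The check only warns. Whether a warning should block creation or just prompt the user is exactly the policy question this issue is asking.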


A bunch of these are from @DerekSikes. Is there a reason the existing agents aren't being found?

If you mean the nonunique names, that's probably me recovering verbatims that had been re-created post-verbatimization (and blaming Derek for it!). Likely one of them should be marked a duplicate of the other, but there's not enough information for me (nor anyone else, I suspect....) to actually determine that.

Going forward, I am going to favor Snarski over unknown.... :-)

While I'm here can someone explain the difference between 'create person' and 'create agent'?
Apparently use of Arctos for over 10 years hasn't gotten me to an advanced enough knowledge level to know this :)

If one asserts that two agents are the same (one is a bad duplicate), I assume any metadata for the bad duplicate is lost. Is this true?

I can see where two agent records are duplicates but both have different metadata and a 'merging' of metadata would be helpful.

Pretty sure 'et al.' is a mistake. It is often used on labels without enough space to write all the agents involved, so it indicates "and others", but it would only be used in conjunction with an actual agent, e.g.
https://arctos.database.museum/guid/UAM:Ento:275837
has
collector
Amy M. Runck
et al.

but instead of 2 collectors with one being 'et al.', should there be only "Amy M. Runck et al." as a single string?

Or maybe we should keep the et al. agent so we don't need to have a separate agent for every real person who might end up in an et al. situation?

> difference

https://arctos.database.museum/info/ctDocumentation.cfm?table=ctagent_type

(And maybe that's all silly - table person is decades-gone, the rules which replaced it are now gone, the philosophy which required the rules may be gone - but I'm clearly having a hard time getting over that, hence this issue! - maybe type just isn't important and we're pointlessly making things difficult for ourselves, IDK.)

> assume

https://arctos.database.museum/info/ctDocumentation.cfm?table=ctagent_attribute_type#bad_duplicate_of

Nothing's lost, nothing's automated, users just get a notification suggesting an update.

> 'merging' of metadata would be helpful

Absolutely - and that's a job for a human.
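
To give that human something to work from, the attributes of two suspected duplicates can be lined up side by side. A minimal sketch, assuming a toy `agent_attribute` table with only the columns used in the queries above (`agent_id`, `attribute_type`, `attribute_value`) and run through Python's sqlite3. The helper name `compare_attributes`, the agent IDs 1 and 2, and the sample rows are all made up for illustration.

```python
import sqlite3

def compare_attributes(conn, keep_id, dup_id):
    """Return (attribute_type, value_on_keep, value_on_dup) for every type either agent has."""
    sql = """
        select t.attribute_type,
               (select group_concat(attribute_value, ' | ') from agent_attribute
                 where agent_id = ? and attribute_type = t.attribute_type) as keep_value,
               (select group_concat(attribute_value, ' | ') from agent_attribute
                 where agent_id = ? and attribute_type = t.attribute_type) as dup_value
        from (select distinct attribute_type from agent_attribute
               where agent_id in (?, ?)) t
        order by t.attribute_type
    """
    return conn.execute(sql, (keep_id, dup_id, keep_id, dup_id)).fetchall()

# Illustrative data only (assumption): agent 1 is the record to keep, agent 2 the duplicate.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table agent_attribute (agent_id integer, attribute_type text, attribute_value text)"
)
conn.executemany("insert into agent_attribute values (?, ?, ?)", [
    (1, "first name", "Jared"),
    (1, "last name", "Hughey"),
    (2, "last name", "Hughey"),
    (2, "remarks", "Bee collector in Alaska 2021"),
])

rows = compare_attributes(conn, 1, 2)
for row in rows:
    print(row)
```

A `None` in either value column marks metadata that exists on only one of the two records, which is the part a person would need to carry over by hand.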

> mistake

You can make it a bad duplicate of something, then use....

[screenshot: 2024-04-09 at 04 53 23]

... to update any records.

> separate agent for every real person who might end up in an et al. situation?

Definitely not something I'd be happy to see, but I'm not hearing a lot of guidance from @ArctosDB/agents-committee at the moment....

To expand slightly on that and wander even farther from the topic at hand, I'm not happy to see "sorry, you're not very important to us" in ANY form, even when it's not me personally mushed into the et al. or faceless agency (see the host's attribute determiner) or whatever. Maybe I'm just being twitchy - I don't have field notebooks, we were recording data directly on AF sheets, the "traditional" importance of collector is perhaps less-relevant here - but I still think proper attribution is in general about the second-most important thing that museums can do (the first being solid links between literature and material).

Thanks. And for the record... our best practices protocol is to record all collectors in Arctos, but then for our tiny insect labels we often abbreviate to et al. The problem arises when someone makes labels BEFORE the data are captured and all we have are the tiny labels to enter the data from (retroactive data capture) - which is 99%+ of what most entomology collections are dealing with. UAM Entomology is a rare exception in that most of our data capture is prospective (before labeling).

More fun, there are a lot of agents being created with various could-be-important data stuffed into remarks.

We need a webinar, or some sort of education campaign??


select
    agent.agent_id,
    preferred_agent_name,
    getpreferredagentname(agent.created_by_agent_id) creator,
    agent.created_date,
    rem.attribute_value
from 
    agent
    inner join agent_attribute rem on agent.agent_id=rem.agent_id and rem.attribute_type in ('remarks')
    left outer join agent_attribute on agent.agent_id=agent_attribute.agent_id and agent_attribute.attribute_type in (
        select attribute_type from ctagent_attribute_type where purpose in ('address','event')
    )
where
    agent.agent_type='person' and
    agent_attribute.attribute_id is null and
    agent.created_date > current_date - interval '3 months'
order by agent.created_date desc
;

 agent_id |    preferred_agent_name    |        creator         |        created_date        |                                                  attribute_value                                                  
----------+----------------------------+------------------------+----------------------------+-------------------------------------------------------------------------------------------------------------------
 21352220 | L. Mironova                | Elena Taboko Taku      | 2024-04-12 11:50:41.997135 | Collected plants in Russia in 1986.
 21352218 | A. Nikulin                 | Elena Taboko Taku      | 2024-04-12 11:44:36.500603 | Collected plants in Russia in 1987.
 21352217 | V. Zemtsov                 | Elena Taboko Taku      | 2024-04-12 11:37:40.970177 | Collected plants in Russia in 1987.
 21352216 | V. F. Ezrailson            | Elena Taboko Taku      | 2024-04-12 11:32:48.073711 | Collected plants in Russia in 1966
 21352215 | I. Belskaya                | Elena Taboko Taku      | 2024-04-12 11:25:12.524505 | Collected plants in Russia in 1979.
 21352214 | T. Ishkova                 | Elena Taboko Taku      | 2024-04-12 11:19:29.117051 | Collected plants in Russia in 1983.
 21352213 | O. Babarykina              | Elena Taboko Taku      | 2024-04-12 11:13:00.603817 | Collected plants in Russia in 1984.
 21352212 | G. Liventseva              | Elena Taboko Taku      | 2024-04-12 11:09:19.76335  | Collected plants in Russia in 1984.
 21352211 | I. Pshenichnaya            | Elena Taboko Taku      | 2024-04-12 11:02:45.327919 | Collected plants in Russia in 1985.
 21352210 | G. D. Dymina               | Elena Taboko Taku      | 2024-04-12 10:54:17.99777  | Collected plants in Russia in 1979.
 21352209 | O. Zhdanova                | Elena Taboko Taku      | 2024-04-12 10:52:07.066693 | Collected plants in Russia in 1986.
 21352204 | O. Feronova                | Elena Taboko Taku      | 2024-04-11 18:26:41.110518 | Collected plants in Russia in 1979
 21352202 | V. Rozhitsina              | Elena Taboko Taku      | 2024-04-11 17:59:02.128203 | Collected plants in Russia in 1978
 21352177 | O. Babarykina              | Elena Taboko Taku      | 2024-04-11 00:23:58.065582 | Collected plants in Russia in 1984
 21352176 | I. Pshenichnaya            | Elena Taboko Taku      | 2024-04-11 00:19:57.708599 | Collected plants in Russia in 1984
 21352175 | L. Dyukova                 | Elena Taboko Taku      | 2024-04-11 00:15:46.461338 | Collected plants in Russia in 1985
 21352174 | L. Mironova                | Elena Taboko Taku      | 2024-04-11 00:09:32.275383 | Collected plants in Russia in 1985
 21352168 | A. Vershinin               | Elena Taboko Taku      | 2024-04-10 14:19:07.197052 | Collected plants in Russia in 1981
 21352166 | A. Andreeva                | Elena Taboko Taku      | 2024-04-10 14:10:44.58855  | Collected plants in Russia in 1980
 21352164 | G. Yakovleva               | Elena Taboko Taku      | 2024-04-10 13:55:12.067619 | Collected plants in Russia in 1976
 21352163 | V. Shein                   | Elena Taboko Taku      | 2024-04-10 13:52:27.865396 | Collected plants in Russia in 1980
 21352162 | S. Borisenko               | Elena Taboko Taku      | 2024-04-10 13:50:53.240816 | Collected plants in Russia in 1980
 21352160 | T. Fedorova                | Elena Taboko Taku      | 2024-04-10 13:49:23.075608 | Collected plants in Russia in 1978
 21352158 | V. Golovanova              | Elena Taboko Taku      | 2024-04-10 13:44:53.973037 | Collected plants in Russia in 1980
 21352157 | N. Sidorenko               | Elena Taboko Taku      | 2024-04-10 13:36:57.756534 | Collected plants in Russia in 1980
 21352154 | A. Maneev                  | Elena Taboko Taku      | 2024-04-10 11:09:13.600432 | Collected plants in Russia in 1982
 21352152 | B. O'Donnell               | Searra Schell          | 2024-04-09 18:25:02.1307   | Collected in Katmai in 1993
 21352135 | N. Tumanova                | C. O. Webb             | 2024-04-05 17:41:31.472675 | Collector of Far Eastern Russia plants in 1971
 21352134 | V. Schvydkaya              | C. O. Webb             | 2024-04-05 17:39:51.876208 | Collector of Far Eastern Russia plants in 1985
 21352023 | G. Schelkovnikova          | C. O. Webb             | 2024-03-22 19:50:41.124828 | Collected plants with S.S Kharkevich in 1974
 21351967 | P. Hafker                  | Alison Whiting         | 2024-03-14 12:55:03.462628 | NEON - Utah project
 21351965 | Ian Pearse                 | C. O. Webb             | 2024-03-14 12:19:59.433642 | Volunteer for AKNHP in 2002
 21351962 | J. Batchelor               | Alison Whiting         | 2024-03-14 10:14:43.730232 | NEON - Utah project
 21351921 | Michael R. Howard          | Jessica K. Tir         | 2024-03-07 10:47:18.654563 | Collected plants and herps for the Walla Walla College collection (plants now housed at WSU Owenby Herbarium)
 21351920 | Lauren Baur                | Mariel L. Campbell     | 2024-03-06 19:46:45.630318 | Sevilleta LTER Research Scientist and Program Manager 2017-2023
 21351920 | Lauren Baur                | Mariel L. Campbell     | 2024-03-06 19:46:45.630318 | Sevilleta LTER Research Scientist and Program Manager
 21351913 | Harold Stowell             | Brooke Bogan           | 2024-03-06 12:27:29.169484 | Professor of Geological Sciences, The University of Alabama
 21351909 | Joann Stoddard             | Alison Whiting         | 2024-03-06 11:08:39.336368 | rehabilitates wildlife
 21351885 | B. Fraser                  | Derek S. Sikes         | 2024-03-05 13:38:39.472276 | Migrated from ALA database.
 21351881 | Nico Limon                 | Derek S. Sikes         | 2024-03-05 13:38:39.405224 | Healy, AK resident 2018
 21351875 | Bundtzen                   | Derek S. Sikes         | 2024-03-05 13:38:39.295728 | Invertebrate bulkload agent.
 21351872 | Schmidt                    | Derek S. Sikes         | 2024-03-05 13:38:39.262718 | C.E. Scmidt? G.A. Scmidt? Karl P. Scmidt? R. Schmidt?
 21351869 | P. Valkenberg              | Derek S. Sikes         | 2024-03-05 13:38:39.202204 | ALS volunteer Lepidoptera collector in AK
 21351868 | S. Weeks                   | Derek S. Sikes         | 2024-03-05 13:38:39.183082 | Bird collection agent.
 21351866 | B. Kelly                   | Derek S. Sikes         | 2024-03-05 13:38:39.13914  | Bird collection agent.
 21351835 | Don Carney                 | Derek S. Sikes         | 2024-03-05 13:38:29.950863 | Insect collector 1960
 21351834 | S. Kogl                    | Derek S. Sikes         | 2024-03-05 13:38:29.933996 | lepidoptera collector
 21351829 | M. Merrell                 | Derek S. Sikes         | 2024-03-05 13:38:29.842861 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351818 | Snarski                    | Derek S. Sikes         | 2024-03-05 13:38:29.660891 | Invertebrate bulkload agent. Possibly David J. Snarski?
 21351808 | W. S. McAlpine             | Derek S. Sikes         | 2024-03-05 13:38:29.46636  | collected Lepidoptera in Alaska
 21351805 | Jared Hughey               | Derek S. Sikes         | 2024-03-05 13:38:29.414319 | Bee collector in Alaska 2021
 21351801 | Mark Detterman             | Derek S. Sikes         | 2024-03-05 13:38:29.341203 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351798 | Buck                       | Derek S. Sikes         | 2024-03-05 13:38:29.300602 | Invertebrate bulkload agent.
 21351787 | Svendsen                   | Derek S. Sikes         | 2024-03-05 13:38:29.090336 | lepidoptera collector
 21351780 | Emily Blattmachr           | Derek S. Sikes         | 2024-03-05 13:38:28.987924 | Anchorage member of the public.
 21351778 | A. Bakke                   | Derek S. Sikes         | 2024-03-05 13:38:28.944342 | ALS volunteer Lepidoptera collector in AK
 21351772 | Olive Kanayurak            | Derek S. Sikes         | 2024-03-05 13:38:28.830954 | Barrow, AK resident 2021
 21351764 | Liz Masi                   | Derek S. Sikes         | 2024-03-05 13:38:28.699816 | collected Lepidoptera in AK
 21351751 | Chelonia Jones             | Derek S. Sikes         | 2024-03-05 13:38:28.435787 | Bee collector in Alaska 2021
 21351747 | K. Bevernitz               | Derek S. Sikes         | 2024-03-05 13:38:28.364936 | ALS volunteer Lepidoptera collector in AK
 21351740 | William A. Lehnhausen      | Derek S. Sikes         | 2024-03-05 13:38:28.24093  | Bird collection agent.
 21351724 | Paul Spitzer               | Derek S. Sikes         | 2024-03-05 13:38:27.955857 | Collector for Alaska Lepidoptera Survey
 21351714 | A. D. Robertson            | Derek S. Sikes         | 2024-03-05 13:38:27.771993 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351711 | Jon Berrie                 | Derek S. Sikes         | 2024-03-05 13:38:27.72166  | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351709 | Aliza Segal                | Derek S. Sikes         | 2024-03-05 13:38:27.680154 | Bee collector in Alaska 2021
 21351706 | Ginger Scoggin             | Derek S. Sikes         | 2024-03-05 13:38:27.627148 | PhD, DNP, ANP-C, Anchorage, AK, 2016
 21351700 | C. Cattell                 | Derek S. Sikes         | 2024-03-05 13:38:27.525267 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351691 | Vicky Koelzer              | Derek S. Sikes         | 2024-03-05 13:38:27.370989 | Gardener in Copper Basin, AK
 21351688 | L. Heimer                  | Derek S. Sikes         | 2024-03-05 13:38:27.312803 | ALS volunteer Lepidoptera collector in AK
 21351684 | Eric Pyne                  | Derek S. Sikes         | 2024-03-05 13:38:27.25229  | KW Philip ALS collector.
 21351680 | P. Redwood                 | Derek S. Sikes         | 2024-03-05 13:38:27.180336 | ALS volunteer Lepidoptera collector in AK
 21351675 | W. Arvey                   | Derek S. Sikes         | 2024-03-05 13:38:27.092862 | collected Lepidoptera in Alaska
 21351672 | Don Bee                    | Derek S. Sikes         | 2024-03-05 13:38:27.026079 | Collected fish with Cal Skaugstad in 1985
 21351667 | P. Merritt                 | Derek S. Sikes         | 2024-03-05 13:38:26.934564 | Insect collector Alaska
 21351659 | Terri Cole                 | Derek S. Sikes         | 2024-03-05 13:38:26.775315 | Collected in DNP, possible with Pat Pyne
 21351656 | J. Bente                   | Derek S. Sikes         | 2024-03-05 13:38:26.714895 | Lepidoptera collector
 21351643 | A. L. Sanchez              | Derek S. Sikes         | 2024-03-05 13:38:26.456186 | Bulkloaded MSB Mammal agent.
 21351640 | L. Halpin                  | Derek S. Sikes         | 2024-03-05 13:38:26.401436 | ALS volunteer Lepidoptera collector in AK
 21351636 | Bethany Walker             | Derek S. Sikes         | 2024-03-05 13:38:26.332221 | 2019 UAF Bug Camper, Fairbanks, AK
 21351631 | Eric Castro                | Derek S. Sikes         | 2024-03-05 13:38:26.250353 | Bee collector in Alaska 2021
 21351630 | T. Hudson                  | Derek S. Sikes         | 2024-03-05 13:38:26.233311 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351615 | M. Jetton                  | Derek S. Sikes         | 2024-03-05 13:38:25.973261 | ALS volunteer Lepidoptera collector in AK
 21351612 | T. Dickel                  | Derek S. Sikes         | 2024-03-05 13:38:25.926398 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351609 | Lyle Krichen               | Derek S. Sikes         | 2024-03-05 13:38:25.881901 | trapper in Cordova
 21351603 | H. Tagarook                | Derek S. Sikes         | 2024-03-05 13:38:25.764963 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351601 | L. F. Elliott              | Derek S. Sikes         | 2024-03-05 13:38:25.718377 | Associated with R. Wilk
 21351599 | Meekin                     | Derek S. Sikes         | 2024-03-05 13:38:25.66781  | Lepidoptera collector
 21351588 | Blakeslee                  | Derek S. Sikes         | 2024-03-05 13:38:20.529297 | First initials unknown, collected for Alaska Lepidoptera Survey in 1975
 21351582 | G. Cranna                  | Derek S. Sikes         | 2024-03-05 13:38:20.42808  | ALS volunteer Lepidoptera collector in AK
 21351580 | Terri Wild                 | Derek S. Sikes         | 2024-03-05 13:38:20.391425 | field technician Seward Peninsula, AK 2013
 21351579 | Skyler Jordan              | Derek S. Sikes         | 2024-03-05 13:38:20.373707 | field technician Seward Peninsula, AK 2013 [assumed to be same as Skyler C. Jordan, bee collector in Alaska 2021]
 21351569 | Jeff Foley                 | Derek S. Sikes         | 2024-03-05 13:38:20.218304 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351567 | J. Trent                   | Derek S. Sikes         | 2024-03-05 13:38:20.189499 | KWP, butterfly collector
 21351561 | Teri C. Wild               | Derek S. Sikes         | 2024-03-05 13:38:20.087479 | Bee collector in Alaska 2021
 21351560 | Walter T. Phillips         | Derek S. Sikes         | 2024-03-05 13:38:20.069083 | collected Lepidoptera in Alaska
 21351555 | Sue Quinlan                | Derek S. Sikes         | 2024-03-05 13:38:19.988178 | Migrated from ALA database.
 21351554 | B. Robertson               | Derek S. Sikes         | 2024-03-05 13:38:19.966345 | New Zealand collector on beetles 2002
 21351552 | T. Ward                    | Derek S. Sikes         | 2024-03-05 13:38:19.933408 | ALS volunteer Lepidoptera collector in AK
 21351542 | Bio 116 students           | Derek S. Sikes         | 2024-03-05 13:38:19.790886 | former agent type group
 21351541 | Lindsey Taylor             | Derek S. Sikes         | 2024-03-05 13:38:19.776148 | Bee collector in Alaska 2021
 21351514 | Anna-Marie Kokx            | Derek S. Sikes         | 2024-03-05 13:38:19.342074 | collector
 21351503 | Mary A. Calmes             | Derek S. Sikes         | 2024-03-05 13:38:19.151287 | Migrated from ALA database.
 21351498 | C. A. Pease                | Derek S. Sikes         | 2024-03-05 13:38:19.036977 | Charles A. Pease?
 21351495 | Karen Henderson            | Derek S. Sikes         | 2024-03-05 13:38:18.8914   | Collector for Alaska Lepidoptera Survey
 21351478 | Unalakleet School students | Derek S. Sikes         | 2024-03-05 13:38:18.601517 | former agent type group
 21351473 | G. Nielsen                 | Derek S. Sikes         | 2024-03-05 13:38:18.51093  | collected Lepidoptera for Alaska Lepidoptera Survey
 21351471 | M. Shepherd                | Derek S. Sikes         | 2024-03-05 13:38:18.380863 | Miss Margaret Shepherd?
 21351469 | G. Kunkle                  | Derek S. Sikes         | 2024-03-05 13:38:18.354872 | ALS volunteer Lepidoptera collector in AK
 21351467 | D. M. Olsen                | Derek S. Sikes         | 2024-03-05 13:38:18.315796 | D.M. Olson?
 21351465 | P. Bente                   | Derek S. Sikes         | 2024-03-05 13:38:18.284953 | ALS volunteer Lepidoptera collector in AK
 21351462 | V. Waldron                 | Derek S. Sikes         | 2024-03-05 13:38:18.230611 | Lepidoptera collector
 21351461 | Gene Darby                 | Derek S. Sikes         | 2024-03-05 13:38:18.213304 | citizen, Kenai, AK
 21351459 | R. Fuson                   | Derek S. Sikes         | 2024-03-05 13:38:18.175991 | Lepidoptera collector
 21351457 | J. Gorham                  | Derek S. Sikes         | 2024-03-05 13:38:18.128328 | collected for Alaska Lepidoptera Survey
 21351449 | R. Mackey                  | Derek S. Sikes         | 2024-03-05 13:38:17.973638 | Bulkloaded MSB Mammal agent.
 21351446 | F. Karpuleon               | Derek S. Sikes         | 2024-03-05 13:38:17.929703 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351445 | J. Jacobs                  | Derek S. Sikes         | 2024-03-05 13:38:17.907213 | J.F. Jacobs? J.W. Jacobs? Jeremy Jacobs?
 21351432 | T. True                    | Derek S. Sikes         | 2024-03-05 13:38:17.703254 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351417 | J. Perkins                 | Derek S. Sikes         | 2024-03-05 13:38:17.487678 | Lepidoptera collector in AK
 21351410 | Hollingsworth              | Derek S. Sikes         | 2024-03-05 13:38:17.379134 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351408 | Ruby An                    | Derek S. Sikes         | 2024-03-05 13:38:17.338636 | PhD student working at Toolik Field Station Alaska, 2022
 21351405 | William Dade               | Derek S. Sikes         | 2024-03-05 13:38:17.28883  | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351401 | Tony Bakos                 | Derek S. Sikes         | 2024-03-05 13:38:17.226548 | Bee collector in Alaska 2021
 21351400 | K. Bury                    | Derek S. Sikes         | 2024-03-05 13:38:17.202589 | Lepidoptera collector
 21351383 | M. Macgow                  | Derek S. Sikes         | 2024-03-05 13:38:16.920149 | Or could be M. Maogow
 21351381 | Roberts                    | Derek S. Sikes         | 2024-03-05 13:38:16.894943 | ALS volunteer Lepidoptera collector in AK
 21351380 | Komarek                    | Derek S. Sikes         | 2024-03-05 13:38:16.880916 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351379 | Charles Fahl               | Derek S. Sikes         | 2024-03-05 13:38:16.864203 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351373 | E. Hibler                  | Derek S. Sikes         | 2024-03-05 13:38:16.764772 | ALS volunteer Lepidoptera collector in AK
 21351372 | E. Halpin                  | Derek S. Sikes         | 2024-03-05 13:38:16.745089 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351361 | N. Threlkeld               | Derek S. Sikes         | 2024-03-05 13:38:16.565538 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351360 | Christina Trimingham       | Derek S. Sikes         | 2024-03-05 13:38:16.548931 | Bee collector in Alaska 2021
 21351358 | Mitchell A. Parsons        | Derek S. Sikes         | 2024-03-05 13:38:16.515226 | Bee collector in Alaska 2021
 21351347 | Stream Ecology Class UAF   | Derek S. Sikes         | 2024-03-05 13:38:16.3071   | Probably from University of Alaska Fairbanks
 21351346 | K. Flaccus                 | Derek S. Sikes         | 2024-03-05 13:38:16.285121 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351344 | B. Wood                    | Derek S. Sikes         | 2024-03-05 13:38:16.250742 | Insect Collector Alaska Caribou Creek Research
 21351342 | M. J. Kennedy-Smith        | Derek S. Sikes         | 2024-03-05 13:37:47.954922 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351341 | Mitch Parsons              | Derek S. Sikes         | 2024-03-05 13:37:47.937632 | field technician Seward Peninsula, AK 2013
 21351337 | Edward Szafran             | Derek S. Sikes         | 2024-03-05 13:37:47.870091 | ALS volunteer Lepidoptera collector in AK
 21351323 | T. Ovenshine               | Derek S. Sikes         | 2024-03-05 13:37:47.62894  | collected Lepidoptera for Alaska Lepidoptera Survey
 21351322 | K. Wilk                    | Derek S. Sikes         | 2024-03-05 13:37:47.612703 | Associated with R. Wilk
 21351319 | Natalie Konig              | Derek S. Sikes         | 2024-03-05 13:37:47.56063  | Bee collector in Alaska 2021
 21351317 | S. Temple                  | Derek S. Sikes         | 2024-03-05 13:37:47.529654 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351315 | Alivia Gonzalez            | Derek S. Sikes         | 2024-03-05 13:37:47.4981   | 2019 UAF Bug Camper, Fairbanks, AK
 21351298 | Don Richter                | Derek S. Sikes         | 2024-03-05 13:37:47.206773 | Collector from Alaska Lepidoptera Survey Volunteer dataset
 21351290 | Cassandra Brown            | Derek S. Sikes         | 2024-03-05 13:36:54.974288 | Bee collector in Alaska 2021
 21351289 | J. Campbell                | Derek S. Sikes         | 2024-03-05 13:36:54.949897 | Bird collection agent.
 21351281 | L. Jennings                | Derek S. Sikes         | 2024-03-05 13:34:13.470231 | collected Lepidoptera for Alaska Lepidoptera Survey
 21351275 | S. Campbell                | Derek S. Sikes         | 2024-03-05 13:34:13.395198 | Migrated from ALA database.
 21351267 | John D. Mertz              | Angela Linn            | 2024-03-04 18:41:16.341581 | UAM Ethnology and History
 21351257 | Dennis L. Shirley          | Alison Whiting         | 2024-03-02 16:51:25.487017 | retired UDWR employee
 21351254 | David Remple               | Alison Whiting         | 2024-03-02 16:34:08.904522 | Falconer
 21351252 | D. A. Fiedler              | Alison Whiting         | 2024-03-02 10:58:52.058268 | Masters in 1974 - St. Cloud State College
 21351175 | Ross M. Anderson           | Alison Whiting         | 2024-02-24 17:22:45.498319 | Ross lives in Sandy and raises parrots for a hobby.  He was the Hogle Zoo veterinarian for many years.
 21351155 | William R. Fraser          | Alison Whiting         | 2024-02-21 16:03:41.320016 | CEO of Polar Oceans Research Group
 21350937 | Natalie Lucero             | J. Tomasz Giermakowski | 2024-01-30 14:24:32.796167 | MSB:Herp
 21350915 | Matthew Campen             | Mariel L. Campbell     | 2024-01-26 17:11:31.136402 | UNMHS School of Pharmacy
 21350848 | Anna Brant                 | Katherine L. Anderson  | 2024-01-13 13:15:23.046587 | University of Washington graduate student

This is why it is key to make remarks visible within the agent pick tool. Seeing remarks will help disambiguate agent picks.

Seeing remarks will help disambiguate agent picks.

For the 1% (probable wild overestimation!) of agents who read (and understand - "ALS"==Amyotrophic Lateral Sclerosis, right?) them, maybe.

Putting those data where they belong makes them available to a much wider audience.

(And #7434 is the place to change how the pick works, but try it in test first.)

Dups

 agent_id | agent_type | preferred_agent_name |       creator       |        created_date        
----------+------------+----------------------+---------------------+----------------------------
 21351197 | person     | David Johnson        | Derek S. Sikes      | 2024-02-26 19:05:23.736418
 21295057 | person     | David Johnson        | Dusty L. McDonald   | 2015-10-06 11:30:13
 21352394 | person     | Elyssa Bush          | Kara Branchflower   | 2024-04-23 10:49:00.568389
 21352393 | person     | Elyssa Bush          | Kara Branchflower   | 2024-04-23 10:46:02.158056
 21352211 | person     | I. Pshenichnaya      | Elena Taboko Taku   | 2024-04-12 11:02:45.327919
 21352176 | person     | I. Pshenichnaya      | Elena Taboko Taku   | 2024-04-11 00:19:57.708599
 21351445 | person     | J. Jacobs            | Derek S. Sikes      | 2024-03-05 13:38:17.907213
 21350919 | person     | J. Jacobs            | Jessica Weller      | 2024-01-27 11:25:38.328081
 21333621 | person     | Lauren Wilson        | Zack Perry          | 2021-07-13 12:39:08.99178
 21300714 | person     | Lauren Wilson        | Erica Krimmel       | 2016-04-12 12:17:17
 21332078 | person     | Linda Moore          | Jonathan L. Dunnum  | 2021-03-16 13:19:03.703952
 21330095 | person     | Linda Moore          | Andrew Charles Doll | 2020-07-27 12:41:15.857541
 21352220 | person     | L. Mironova          | Elena Taboko Taku   | 2024-04-12 11:50:41.997135
 21352174 | person     | L. Mironova          | Elena Taboko Taku   | 2024-04-11 00:09:32.275383
 21352213 | person     | O. Babarykina        | Elena Taboko Taku   | 2024-04-12 11:13:00.603817
 21352177 | person     | O. Babarykina        | Elena Taboko Taku   | 2024-04-11 00:23:58.065582
 21352417 | person     | Tim Wheeler          | Mariel L. Campbell  | 2024-04-24 13:52:16.752771
 21282782 | person     | Tim Wheeler          | Jordan Metzgar      | 2014-10-22 15:45:14

didn't click the magic button

 agent_id | preferred_agent_name |    creator    |        created_date        
----------+----------------------+---------------+----------------------------
 21352408 | Mr. George Laflin    | Olivia Cimino | 2024-04-24 10:30:11.712287

I think some of these are due to lags in the cache. I have added agents and for a while, search won't find them except when I search the EXACT preferred name. I can see how people would add someone again....this needs training.

didn't click the magic button

what magic button?

One of the David Johnsons carries the 'not the same as' relationship.

One of the J. Jacobs carries the 'not the same as' relationship.

One of the Linda Moores carries the 'not the same as' relationship.

@camwebb I think some of these may be your students? See #7649 (comment)

Dupes addressed

lags in the cache

I did write the possibility of changing the cache time (including to zero) into the API, so there's no technical trick in allowing (defaulting, whatever) "us" an override. That could also be used to melt the server, so guidance needed.

needs training

Clearly.

what magic button?

Take your pick, there's one that'll go from....

Screenshot 2024-04-25 at 07 13 56

to

Screenshot 2024-04-25 at 07 14 06

and one that'll go from....

Screenshot 2024-04-25 at 07 14 24

to

Screenshot 2024-04-25 at 07 14 30

not the same as relationship

I can rewrite the scripts, but I think we're up to one example (which I of course can't find) of those being based on any sort of evidence, so ignoring that in data meant for human review seems right. #7719 supports this, I think - creating the duplicate seems defensible, but I can't understand why the relationship was added. I'd still get rid of it if I could....

I just made an agent: Tom Rickman

this was after a research associate wrote: "when I try to enter the collector, Tom Rickman, I tried to create the person as before but it errors out no matter what. Tom is alive. It's really quirky. If I try to add any additional info on him then it just errors out. If I don't enter any of the fields I get the option to force create and then it errors out. "

I was able to create Tom Rickman, but when I did I got a page that showed an unintelligible short list of two Tom Rickmans and a force create button - why two? I had searched and no one existed with that name.

I force created and it worked. Why did it not work for my research associate, Joey Slowik?

And then I told Joey I made the agent but he wrote back:

"Well I got the boxes to turn green but the errors still exist anywhere I put Tom's name. Ideas?

2024-4-25T10:45:30: FAIL: agent_1_name [ Tom Rickman ] is invalid; record_event_determiner [ Tom Rickman ] matches 0 agents; locality_attribute_1_determiner [ Tom Rickman ] matches 0 agents: {"message":"agent_1_name [ Tom Rickman ] is invalid; record_event_determiner [ Tom Rickman ] matches 0 agents; locality_attribute_1_determiner [ Tom Rickman ] matches 0 agents","status":"fail"}"

So I'd say this make-new-agents process is broken in many ways.

arctos-> order by preferred_agent_name,created_date desc;
 agent_id |  agent_type  | preferred_agent_name |       creator       |        created_date        
----------+--------------+----------------------+---------------------+----------------------------
 21351197 | person       | David Johnson        | Derek S. Sikes      | 2024-02-26 19:05:23.736418
 21295057 | person       | David Johnson        | Dusty L. McDonald   | 2015-10-06 11:30:13
 21352459 | organization | Diamond Superior     | Angela Linn         | 2024-04-29 14:19:33.499769
 21352458 | organization | Diamond Superior     | Angela Linn         | 2024-04-29 14:17:46.457997
 21351445 | person       | J. Jacobs            | Derek S. Sikes      | 2024-03-05 13:38:17.907213
 21350919 | person       | J. Jacobs            | Jessica Weller      | 2024-01-27 11:25:38.328081
 21333621 | person       | Lauren Wilson        | Zack Perry          | 2021-07-13 12:39:08.99178
 21300714 | person       | Lauren Wilson        | Erica Krimmel       | 2016-04-12 12:17:17
 21332078 | person       | Linda Moore          | Jonathan L. Dunnum  | 2021-03-16 13:19:03.703952
 21330095 | person       | Linda Moore          | Andrew Charles Doll | 2020-07-27 12:41:15.857541
 21352495 | person       | P. V. Nesterov       | Elena Taboko Taku   | 2024-04-30 21:13:28.868182
 10008785 | person       | P. V. Nesterov       | unknown             | 2013-12-16 21:49:31
 21352417 | person       | Tim Wheeler          | Mariel L. Campbell  | 2024-04-24 13:52:16.752771
 21282782 | person       | Tim Wheeler          | Jordan Metzgar      | 2014-10-22 15:45:14
 21352433 | person       | Tom Rickman          | Derek S. Sikes      | 2024-04-25 12:36:54.346303
 21352432 | person       | Tom Rickman          | Jozef A. Slowik     | 2024-04-25 12:09:25.013787
 21352431 | person       | Tom Rickman          | Jozef A. Slowik     | 2024-04-25 12:06:48.200331
(17 rows)

Can you restrict these searches to exclude those with relationship 'not the same as' please?

Cool stuff in remarks

 agent_id | preferred_agent_name |      creator      |        created_date        |                        attribute_value                        
----------+----------------------+-------------------+----------------------------+---------------------------------------------------------------
 21352576 | Sally Whetstone      | Elena Taboko Taku | 2024-05-08 00:11:27.394131 | Collected plants in Alaska in 1981.
 21352575 | Howard Ulrich        | Elena Taboko Taku | 2024-05-08 00:07:50.725251 | Collected plants in Alaska in 1985.
 21352574 | Richard G. Holoway   | Elena Taboko Taku | 2024-05-08 00:04:12.008903 | Collected plants in Alaska in 1979.
 21352573 | N. D. Atwod          | Elena Taboko Taku | 2024-05-07 23:59:59.088798 | Collected plants in United States, Arizona in 1973.
 21352572 | B. Mitchell          | Elena Taboko Taku | 2024-05-07 23:56:33.364086 | Collected plants in Canada in 1978.
 21352571 | K. Paige             | Elena Taboko Taku | 2024-05-07 23:53:06.191529 | Collected plants in Canada in 2001.
 21352570 | J. L. Penny          | Elena Taboko Taku | 2024-05-07 23:51:20.805321 | Collected plants in Canada in 2001.
 21352569 | J. Miner             | Elena Taboko Taku | 2024-05-07 23:46:32.015662 | Collected plants in Alaska in 2008.
 21352557 | Alfonso Doucette     | Elena Taboko Taku | 2024-05-06 23:44:27.330045 | Collected plants in Alaska in 2007.
 21352554 | Giovana D'Angelo     | Mingna Zhuang     | 2024-05-06 10:04:32.123804 | graduate student of entomology
 21352548 | N. Fedorova          | Elena Taboko Taku | 2024-05-05 15:09:30.370723 | Collected plants in Turkmenistan in 1940 with Al. A. Fedorov.
 21352546 | J. Ryder             | Elena Taboko Taku | 2024-05-05 00:39:57.671707 | Collected in Canada with Bruce A. Bennett in 2005.
 21352545 | J. Line              | Elena Taboko Taku | 2024-05-05 00:36:03.159676 | Collected plants in Canada with Bruce A. Bennett in 2005.
 21352544 | O. Ceska             | Elena Taboko Taku | 2024-05-05 00:28:20.448506 | Collected plants in Canada in 2002.
 21352543 | C. S. Tomb           | Elena Taboko Taku | 2024-05-04 00:13:29.460839 | Collected plants in Russia in 1978.
 21352542 | J. B. McCarthy       | Elena Taboko Taku | 2024-05-04 00:06:23.235148 | Collected plants in Alaska in 1984.
 21352540 | Ann Odasz            | Elena Taboko Taku | 2024-05-03 23:36:08.575995 | Collected plants in United States, Wyoming in 1979.
 21352538 | J. Alto              | Elena Taboko Taku | 2024-05-03 13:40:47.843067 | Collected plants in Alaska in 1985.
 21352537 | Anders Michelsen     | Elena Taboko Taku | 2024-05-03 13:38:37.474309 | Collected plants in Greenland in 1984.
 21352536 | Helle Byrge          | Elena Taboko Taku | 2024-05-03 13:37:05.162128 | Collected plants in Greenland in 1984.
 21352535 | L. S. Dick           | Elena Taboko Taku | 2024-05-03 13:33:12.549559 | Collected plants in Alaska in 1970.
 21352534 | F. LeBlanc           | Elena Taboko Taku | 2024-05-03 13:29:41.051965 | Collected plants in Canada in 1961.
 21352533 | F. Ernest            | Elena Taboko Taku | 2024-05-03 13:23:07.398095 | Collected plants in Canada in 1961.
 21352532 | L. Duhamel           | Elena Taboko Taku | 2024-05-03 13:21:15.439678 | Collected plants in Canada in 1961.
 21352529 | L. M. Zudova         | Elena Taboko Taku | 2024-05-03 12:54:46.906752 | Collected plants in Russia in 1971.
 21352518 | V. N. Kononov        | Elena Taboko Taku | 2024-05-02 00:43:32.724689 | Collected plants in Moldova in 1949 -1958.
 21352515 | Marcie E. Mondt      | Alyssa Semerdjian | 2024-05-01 14:32:22.871172 | Cal Poly Humboldt student in early 1990s

I just searched on Tom Rickman who has 3 agent IDs in your prior list and yet only 1 agent is found from my search. What's up? I expected to find 3.

What's up?

Bug - thanks, I'll squash it for next release.

relationship

Here are dups which don't have any relationships. (Which could involve the worst possible scenario: A well-known agent isn't getting proper attribution because there's a duplicate, and now I'm ignoring that in the reports-or-whatever because one of them has some data. That'll need careful handling if this goes anywhere.)

On the subject of going somewhere, please see #7649 (comment) - my goal here isn't to clean up little bits and pieces, it's to develop policy regarding the extent to which low-quality data should be an "Arctos problem," and what "low quality" means plus what I can do about it if we do want to establish any sort of standards/suggestions/procedures/reports/whatever.

 agent_id |  agent_type  | preferred_agent_name |      creator       |        created_date        
----------+--------------+----------------------+--------------------+----------------------------
 21352459 | organization | Diamond Superior     | Angela Linn        | 2024-04-29 14:19:33.499769
 21352458 | organization | Diamond Superior     | Angela Linn        | 2024-04-29 14:17:46.457997
 21352495 | person       | P. V. Nesterov       | Elena Taboko Taku  | 2024-04-30 21:13:28.868182
 10008785 | person       | P. V. Nesterov       | unknown            | 2013-12-16 21:49:31
 21352417 | person       | Tim Wheeler          | Mariel L. Campbell | 2024-04-24 13:52:16.752771
 21282782 | person       | Tim Wheeler          | Jordan Metzgar     | 2014-10-22 15:45:14
 21352433 | person       | Tom Rickman          | Derek S. Sikes     | 2024-04-25 12:36:54.346303
 21352432 | person       | Tom Rickman          | Jozef A. Slowik    | 2024-04-25 12:09:25.013787
 21352431 | person       | Tom Rickman          | Jozef A. Slowik    | 2024-04-25 12:06:48.200331
(9 rows)
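For anyone who wants to experiment with this kind of filter outside Arctos, here is a minimal sketch of the idea, not the actual report query. It assumes a toy two-table layout (agent and agent_attribute, loosely modeled on the column names that appear in this thread) and assumes the 'not the same as' relationship is stored as an attribute_type value. The helper names and the in-memory SQLite database are hypothetical, and which David Johnson carries the relationship is picked arbitrarily for the demo.

```python
import sqlite3


def unrelated_duplicates(conn):
    """Return (name, agent_id) rows for preferred names shared by 2+ agents,
    skipping any name where one of the agents has a 'not the same as' attribute."""
    sql = """
        select a.preferred_agent_name, a.agent_id
        from agent a
        where a.preferred_agent_name in (
            select preferred_agent_name from agent
            group by preferred_agent_name having count(*) > 1
        )
        and not exists (
            -- drop the whole name group if any member carries the relationship
            select 1
            from agent b
            join agent_attribute x on x.agent_id = b.agent_id
            where b.preferred_agent_name = a.preferred_agent_name
              and x.attribute_type = 'not the same as'
        )
        order by a.preferred_agent_name, a.agent_id desc
    """
    return conn.execute(sql).fetchall()


def demo_connection():
    """Build a tiny in-memory database (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table agent (agent_id integer primary key, preferred_agent_name text);
        create table agent_attribute (agent_id integer, attribute_type text);
        insert into agent values
            (21351197, 'David Johnson'), (21295057, 'David Johnson'),
            (21352417, 'Tim Wheeler'),   (21282782, 'Tim Wheeler'),
            (21352408, 'Mr. George Laflin');
        -- arbitrary choice for the demo: the thread does not say which one has it
        insert into agent_attribute values (21351197, 'not the same as');
    """)
    return conn


if __name__ == "__main__":
    for name, agent_id in unrelated_duplicates(demo_connection()):
        print(agent_id, name)
```

In this sketch both David Johnsons drop out because one of them carries the relationship, while the Tim Wheeler pair stays on the list for human review.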

The aforementioned bug involved person-agents with no first or last name. Note that the create form will suggest these values with one click.


 agent_id | preferred_agent_name |     creator     |        created_date        
----------+----------------------+-----------------+----------------------------
 21352432 | Tom Rickman          | Jozef A. Slowik | 2024-04-25 12:09:25.013787
 21352431 | Tom Rickman          | Jozef A. Slowik | 2024-04-25 12:06:48.200331
 21352408 | Mr. George Laflin    | Olivia Cimino   | 2024-04-24 10:30:11.712287
 21352122 | Jack Spratt          | Jozef A. Slowik | 2024-04-05 12:12:44.141425
 21352119 | C. Stillman          | Jozef A. Slowik | 2024-04-04 16:05:10.937667
 21352117 | B. S. Blitz          | Jozef A. Slowik | 2024-04-04 15:34:50.688057
 21352116 | F. Sorensen          | Jozef A. Slowik | 2024-04-04 15:25:19.630444
 21352109 | M. Rosy              | Jozef A. Slowik | 2024-04-04 12:59:13.243463


From error logs: https://arctos.database.museum/agent.cfm?agent_name=leah%25barr

Cleaned up, but as long as untrained students are doing this, I expect it to happen daily....

untrained students

Setting policy on that sort of thing seems like something Arctos (the community, not the techy-bits) could do.

Relationship-free duplicate-ish:


 agent_id |  agent_type  |  preferred_agent_name  |         creator          |        created_date        
----------+--------------+------------------------+--------------------------+----------------------------
 21352459 | organization | Diamond Superior       | Angela Linn              | 2024-04-29 14:19:33.499769
 21352458 | organization | Diamond Superior       | Angela Linn              | 2024-04-29 14:17:46.457997
 21331498 | person       | Guillermo D’Elía       | James L. Patton          | 2020-12-13 16:25:11.485838
 21247708 | person       | Guillermo D'Elía       | unknown                  | 2013-12-16 21:49:31
 21280994 | person       | Jack DeVille           | Dusty L. McDonald        | 2014-04-30 14:23:48
 21279626 | person       | Jack De Ville          | Dusty L. McDonald        | 2014-04-30 14:01:26
 21352605 | person       | Jorge Galindo Gonzalez | Jonathan L. Dunnum       | 2024-05-09 13:02:13.260878
 21352604 | person       | Jorge Galindo-Gonzalez | Jonathan L. Dunnum       | 2024-05-09 13:00:41.018711
 21334283 | person       | J. O. Sullivan         | Teresa J. Mayfield-Meyer | 2021-09-07 16:29:24.897153
     7604 | person       | J. O'Sullivan          | unknown                  | 2013-12-16 21:49:31
  1017329 | person       | LaRue                  | unknown                  | 2013-12-16 21:49:31
 21253481 | person       | La Rue                 | unknown                  | 2013-12-16 21:49:31
  1011480 | person       | L. VanHorn             | unknown                  | 2013-12-16 21:49:31
  1010287 | person       | L. Van Horn            | unknown                  | 2013-12-16 21:49:31
 21253957 | person       | Mary O’Donnel          | unknown                  | 2013-12-16 21:49:31
 21256873 | person       | Mary O'Donnel          | unknown                  | 2013-12-16 21:49:31
 21286922 | person       | Norma Le Veque         | Dusty L. McDonald        | 2014-11-07 12:22:41
 21285505 | person       | Norma LeVeque          | Dusty L. McDonald        | 2014-11-07 12:22:32
 21352495 | person       | P. V. Nesterov         | Elena Taboko Taku        | 2024-04-30 21:13:28.868182
 10008785 | person       | P. V. Nesterov         | unknown                  | 2013-12-16 21:49:31
 21258795 | person       | Rößner                 | unknown                  | 2013-12-16 21:49:31
 21257004 | person       | Röner                  | unknown                  | 2013-12-16 21:49:31
 21352417 | person       | Tim Wheeler            | Mariel L. Campbell       | 2024-04-24 13:52:16.752771
 21282782 | person       | Tim Wheeler            | Jordan Metzgar           | 2014-10-22 15:45:14
 21352433 | person       | Tom Rickman            | Derek S. Sikes           | 2024-04-25 12:36:54.346303
 21352432 | person       | Tom Rickman            | Jozef A. Slowik          | 2024-04-25 12:09:25.013787
 21352431 | person       | Tom Rickman            | Jozef A. Slowik          | 2024-04-25 12:06:48.200331
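Most of the pairs above differ only in spacing, hyphens, periods, or straight versus curly apostrophes. A minimal sketch of one way to flag such pairs follows; it is not how the list above was produced, and the function names are hypothetical. It builds a loose comparison key from each preferred name and groups agents that share a key. Since multiple people with the same name have existed, anything it finds is only a candidate for human review.

```python
import unicodedata
from collections import defaultdict


def name_key(name: str) -> str:
    """Collapse a preferred name to a loose comparison key."""
    # Decompose accented characters, then drop the combining marks (í -> i).
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep only letters and digits: this removes spaces, hyphens, periods,
    # and both straight (') and curly (’) apostrophes.
    return "".join(ch for ch in stripped.casefold() if ch.isalnum())


def near_duplicates(agents):
    """Group (agent_id, preferred_name) pairs whose loose keys collide."""
    groups = defaultdict(list)
    for agent_id, name in agents:
        groups[name_key(name)].append(agent_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


if __name__ == "__main__":
    sample = [
        (21352605, "Jorge Galindo Gonzalez"),
        (21352604, "Jorge Galindo-Gonzalez"),
        (21334283, "J. O. Sullivan"),
        (7604, "J. O'Sullivan"),
        (21352408, "Mr. George Laflin"),
    ]
    for key, ids in near_duplicates(sample).items():
        print(key, ids)
```

The key is deliberately aggressive, so it will also pair names that really are different people; it is a way to build a review list, not a merge rule.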

@DerekSikes can you give us any idea why this happened?


 agent_id |  agent_type  |  preferred_agent_name  |         creator          |        created_date        
----------+--------------+------------------------+--------------------------+----------------------------
 21352433 | person       | Tom Rickman            | Derek S. Sikes           | 2024-04-25 12:36:54.346303
 21352432 | person       | Tom Rickman            | Jozef A. Slowik          | 2024-04-25 12:09:25.013787
 21352431 | person       | Tom Rickman            | Jozef A. Slowik          | 2024-04-25 12:06:48.200331

@AJLinn any idea why this happened?


 agent_id |  agent_type  |  preferred_agent_name  |         creator          |        created_date        
----------+--------------+------------------------+--------------------------+----------------------------
 21352459 | organization | Diamond Superior       | Angela Linn              | 2024-04-29 14:19:33.499769
 21352458 | organization | Diamond Superior       | Angela Linn              | 2024-04-29 14:17:46.457997

@jldunnum any idea why this happened?


 agent_id |  agent_type  |  preferred_agent_name  |         creator          |        created_date        
----------+--------------+------------------------+--------------------------+----------------------------
 21352605 | person       | Jorge Galindo Gonzalez | Jonathan L. Dunnum       | 2024-05-09 13:02:13.260878
 21352604 | person       | Jorge Galindo-Gonzalez | Jonathan L. Dunnum       | 2024-05-09 13:00:41.018711

I am hopeful that we can figure out why these almost-immediate duplicates were made. Is there something going on that could help make adding agents better?

mkoo commented

Might be the feedback lag. Once you create or edit an agent, the cache is a serious instant-gratification/acknowledgment impediment. So it seems to me multiples get made because the user receives no confirmation or feedback and thinks nothing has happened (so you do it again!). I have had this happen to me with editing profiles and adding in more data to agents.

I do think there's a technical solution to this whether UI/UX or maybe a backend resource reallocation to make this more realtime-y?

lag

yes, I'm pretty sure that's part of it.

realtime-y

Week-ish done: #7738 (and only maybe-one meltdown, not too bad!)

mkoo commented

lag

yes, I'm pretty sure that's part of it.

So maybe some nice obvious feedback banners, when someone hits buttons!? (You just made an agent! Edit SAVED!) Could be applied generally with button pressing action

realtime-y

Week-ish done: #7738 (and only maybe-one meltdown, not too bad!)

well one of these was made today; we could see if there's further suspicious duplicate agents made or maybe make more realtimey?

more realtimey

Uhh - the spice must flow??

I think it is in real time, or as close as the speed of stuffing photons through a few hundred miles of fiber optics allows. If there's some way to make that not happen, I'd REALLY like to know about it. If there's some other bug, I'd also like to know about that. UI improvement ideas always welcome as well.

mkoo commented

haha, yea if that's the case no need for hallucinated realities (I thought i read that it was on a less than weekly update)!

then we are left with some UI/UX tweaks? I don't know if any other guardrails to actual creation are needed right now since it will put us right back where we were. I might need a realtime chat to spitball more ...

weekly

GAK! No, it was something less than an hour!

guardrails to actual creation

One alternative possibility is reports as above (but maybe not with only Teresa fixing everything...).

chat

Yea, probably.

Also worth mentioning that the vast majority of agents are pretty good. There's one verified agent without additional information (University of Colorado by @ebraker), and ~700 of the ~2000 created in the last 6 months lack clarifying data (most of those are me pretending to be Derek recovering verbatim agents - many of the rest do have some information, it's just stuffed into remarks - report here).

This should run in writeSQL if anyone's curious:

-- Agents created in the last 6 months with no address, identifier,
-- relationship, or event attribute (the left join finds no matching row).
select
    agent.agent_id bare_id,
    'https://arctos.database.museum/agent/'||agent.agent_id agent_id,
    agent.preferred_agent_name,
    getPreferredAgentName(agent.created_by_agent_id) creator
from
    agent
    left outer join agent_attribute on agent.agent_id=agent_attribute.agent_id and agent_attribute.attribute_type in (
        select attribute_type from ctagent_attribute_type where purpose in ('address','identifier','relationship')
        union select 'event' attribute_type
    )
where
    agent_attribute.attribute_id is null and
    agent.created_date > current_timestamp - interval '6 months';

I have to say - doing this "merge" was a giant pain in the ... and took up way too much of my time. We need something a little more automated - at least for moving all of the attributes from the "bad duplicate" to the agent that will remain.

@jldunnum thanks - if that happens again can you get us a screenshot of the error?

@dustymc see above - maybe a clue to the duplicates created recently?

Resolved #7649 (comment)

Made Jorge Galindo Gonzalez a bad duplicate of Jorge Galindo-Gonzalez

There is a bug (squashed for next release) on that pathway, but I can't make it skip the duplicate notification. There may also be some complications when the duplicate has no name components, just a preferred name - allowing really low-quality data definitely has some not-great influence on future actions. Anyway, things should be slightly better in the near future, details on whatever led to any sort of problem are always appreciated, thanks!

need something a little more automated

It was a different system, this isn't a "no," but: way back when we did that, it turned out to mostly be a very useful way to make problems immortal. Somehow I think we'd need a 'careful person saving only the good stuff' filter in there, I'm not sure how we might do that.

a 'careful person saving only the good stuff' filter in there, I'm not sure how we might do that.

A download of attributes from the bad duplicate that could be uploaded to the good would be better than a ton of copy-pasta.

@AJLinn any idea why this happened?

Those were created 2 minutes apart from one another. If I remember right, I might have clicked out of the agent creation pop-up to look at something else relating to the agent name, and the pop-up window closed but must have saved without hitting create agent. So I went back in and created the agent again.

https://docs.google.com/spreadsheets/d/1sosC-w8xHpyXD0g_x1n2-ub37jEe_CK39cj-mtic9dk/edit#gid=384759468 is a spreadsheet of agents who share first and last name. The temp_agent_share_firstlast_mc tab (multiple character names only) is probably sufficiently overwhelming to decide if we'd like to do anything about any of this or not. There are definitely a few agents I picked up in a quickish skim which suggest #7649 (comment) (eg we are failing both operators and contributors with our lack of training-or-something).

One of these has been marked as a duplicate, but not by the creator. Disallowing shared email addresses would address some of this:

https://arctos.database.museum/agent/21352394
https://arctos.database.museum/agent/21352392
https://arctos.database.museum/agent/21352393

Maybe John just really dislikes his last name, but it still led to a duplicate

https://arctos.database.museum/agent/21352170
https://arctos.database.museum/agent/21352628

??????????? maybe something about the search UI ????????????????

https://arctos.database.museum/agent/21352600
https://arctos.database.museum/agent/21317611

These have the same name, same email address, created by the same person, and both have operator accounts! Even if we do nothing in the name of proper attribution, we should find some way to avoid this situation as a matter of security. And another case where the not-actually-emails are causing tangible problems.

https://arctos.database.museum/agent/21339858
https://arctos.database.museum/agent/21339906

2 minutes apart

I think that was probably related to #7738. I just clicked as fast as I could and got the expected warning.

Screenshot 2024-05-14 at 15 34 07

@Jegelewicz

I think some of these may be your students?

Yes, Elena is working with us at UAM:Herb. She's doing a super job doing detective work on Russian collectors, of which we have hundreds of names with little or no metadata. But some dups may slip through - I'll check.

@dustymc

There are a lot of agents being created with various could-be-important data stuffed into remarks.

Where else should these comments go? If all we know is that a name was a collector in a place at a time there are no additional agent relationships that can be made, but having some info in the remarks can help point another user in the right direction.

Remarks are expected to contain "I'm not sure how to spell 'pumpkin'" and "agent known to like tatertots" and EVERYTHING else - except the stuff which has typed fields, such as places and dates.

"There, then, doing that" in remarks is useful - to the maybe 1% of people who read remarks and are able to successfully figure out what it means.....

I messed with https://arctos.database.museum/agent/21352177 (and maybe made some bad assumptions, please review). Now that record contains....

Screenshot 2024-05-22 at 07 33 54
  • typed (sorted, labeled, organized, whatever you want to call it) place information!!!! "Russia" is still a string and doesn't carry much precision, but it's where humans expect that sort of thing and even machines can do basic stuff with it.
  • date information!! Humans and computers can do all sorts of things with that (starting with finding it), and it even helps define 'Russia' - that's well after the empire had expanded to Asia, ferexample.

Same information (assuming I didn't muck it up or make bad assumptions), organized so that it's MUCH more useful for all sorts of things, including finding those maybe-inevitable duplicates (eg "find ALA agents reported as doing stuff in 1984" just became possible).
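Many of the remarks listed earlier in this thread follow a "Collected plants in <place> in <year>." pattern, so a first pass at pulling out the place and year could be scripted. The sketch below is only an illustration under that assumption: the regular expression and the function name are hypothetical, it only handles that one sentence shape, and anything it extracts would still need a human to confirm it before it goes into a typed field.

```python
import re

# Matches remarks such as "Collected plants in Alaska in 1985." or
# "Collected in Canada with Bruce A. Bennett in 2005." (pattern is an assumption).
REMARK_PATTERN = re.compile(
    r"^Collected(?: plants)? in (?P<place>.+?)"
    r"(?: with (?P<companion>.+?))?"
    r" in (?P<year>\d{4})\.$"
)


def parse_remark(remark):
    """Return place, companion, and year from a remark, or None if it does not fit."""
    match = REMARK_PATTERN.match(remark.strip())
    if match is None:
        # Anything that does not fit the pattern is left alone for a human to read.
        return None
    return {
        "place": match["place"],
        "companion": match["companion"],  # None when no "with ..." clause is present
        "year": int(match["year"]),
    }


if __name__ == "__main__":
    for text in (
        "Collected plants in Alaska in 1985.",
        "Collected in Canada with Bruce A. Bennett in 2005.",
        "Collected plants in Moldova in 1949 -1958.",
    ):
        print(text, "->", parse_remark(text))
```

Date ranges and free-form notes deliberately fall through as None rather than being guessed at.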

Thanks @dustymc - that makes sense, and of course I agree that entering data into dedicated fields is always better than loose text. But... it's also a leap to say that the correspondence address of a person who collected in Russia is "..., ..., Russia". Maybe a new field 'active in' would be good. Or... better, an auto-generated list of countries and dates of records that the agent is associated with in Arctos.

Back to the larger, old issue: should we make agents at all? I reread the handbook and it's very clear that it's often better not to create agents at all. But... as @DerekSikes and others point out, verbatim agents don't play well with reports, labels. A single agent model (real or verbatim) is just easier to deal with and we're trying to use 'real' agents. That said, not making agents would speed up our transcription process hugely - I reckon ~50% of my own time spent on bulkloading data entered by assistants is spent on reconciling agents - if I pushed everything into verbatim agents it would be a doddle. I recently got Elena to start researching Russian agents, and she's doing great, but we simply don't have much info on the majority of names.

Perhaps a community-wide event is needed, as suggested above?

Maybe a crazy idea but how about: all agents in catalog records are verbatim; all agents in the agent table are real; if they match 100% then a relationship exists, if they do not, then it doesn't (but nothing bad happens, that is, no regulations against having v-agents in catalog record with no real agent matching).

All reports from catalog records would use the verbatim agent field of the catalog record.

correspondence address

Yea, there's an issue somewhere, I lost the argument that we need some less-addressey-address-thing, feel free to start it again, I'll agree with you!

auto-generated list of countries and dates of records that the agent is associated with in Arctos

That very nearly always has negative value.

  • you assert a funky agent, geography, whatever
  • scripts or people or whatever find that and say "agent <---> whatever"
  • You notice and fix your horrible mistake
  • Everyone spends the next forever trying to understand this association

#7796 - pulling live data in whatever form - is of course fine, anyone can go check it in context.

If that's what you're doing then remarks is probably the best mechanism, method would be very useful, and my "interpretation" likely vastly overplays the available hand.

not making agents would speed up our transcription process hugely

My position (which led to heavy verbatimization, which then somehow led here) has not changed: Make agents if they DO STUFF for you, don't if they don't. If you know "John Doe" then you're losing nothing by using verbatim, it can easily carry all you have. If you know it's that John Doe then you need an agent-object to carry that information.

match 100%

Multiple people named John Doe have existed.

verbatim agents don't play well with

Not sure I buy that, there were no actionable requests to remap or such.

reports

I'm always happy to help with them, they can use whatever you want them to use.

A single agent model (real or verbatim) is just easier to deal with

That is at least the point that made sense to me, and clearly some agents do have information that verbatim can't carry (shipping addresses, ORCIDs, etc.) so here we are. I'd still use verbatim if I had "verbatim-level" data, but I'm not going to push anyone in that direction very enthusiastically either (pending guidance from The Community here, of course).

~50% of my own time spent on bulkloading data entered by assistants is spent on reconciling agents

And I reckon that's probably not very productive, because you're probably dealing with out-of-context strings. Much of the idea of verbatim was to delay that investigation until AFTER entry, when you have the context to notice (and can request tools to help you notice) the two John Does spend a lot of time in the same places, or have a huge temporal gap, or WHATEVER thing that's not generally available from the string "John Doe." Don't think there's anything hindering that right now, but I'm also not sure what level of resources I could devote to helping.

community-wide event

I'm begging for guidance here, yes please. If we want to set some quality standards then I can probably help with tools, if we don't then I've got plenty of other things to do!

verbatim agents don't play well with

Not sure I buy that, there were no actionable requests to remap or such.

Is there an existing SQL function (that you made) to concatenate agents and verbatim agents? ... for reports and labels?

Concatenation may be sufficient for many uses, but necessarily has to lose data about the order of collectors. If a specimen had three collectors: A, B and C (in that order) and its record has agents A and C, and verbatim agent B, then there is no way (other than remarks) to indicate the correct order of collectors - any concatenation will give A, C, and B.
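A minimal sketch of the ordering problem described above. The record layout here (separate lists for linked agents and verbatim agents, and a hypothetical `position` field) is an assumption for illustration only, not the Arctos data model or an existing report function.

```python
def concatenate(agents, verbatim_agents):
    """Join linked agents first, then verbatim agents.

    The original collector order cannot be recovered from two separate lists.
    """
    return ", ".join(agents + verbatim_agents)


def concatenate_ordered(collectors):
    """Join collectors by an explicit position, whatever kind each one is."""
    ordered = sorted(collectors, key=lambda c: c["position"])
    return ", ".join(c["name"] for c in ordered)


# Specimen collected by A, B and C, in that order; only B is verbatim.
naive = concatenate(["A", "C"], ["B"])
ordered = concatenate_ordered([
    {"name": "A", "kind": "agent", "position": 1},
    {"name": "C", "kind": "agent", "position": 3},
    {"name": "B", "kind": "verbatim", "position": 2},
])

print(naive)    # A, C, B
print(ordered)  # A, B, C
```

The second function only works if something like `position` is stored alongside each collector, which is the piece of information the concatenation approach has no place for.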

I think we'll just push on with creating true agents, trying hard not to create duplicates or assign the wrong agent. It is time-consuming, but should create better overall information.

SQL function

There's one in https://arctos.database.museum/Reports/reporter.cfm?action=edit&report_id=85, lots of possibilities....

order of collectors

"Bugs Bunny and Elmer Fudd" is a perfectly cromulent verbatim collector....

agents A and C, and verbatim agent B,

If you know A and C then you can probably figure out B (even if it's just that they were some ephemeral being who probably doesn't have field notes), but sure, there are innumerable fringe cases where strings start having trouble carrying the load.

push on with creating true agents

Nobody seems to be suggesting otherwise here, seems reasonable.

should create better overall information.

I don't think that's the trend, but there are definitely defensible reasons to do that so rock on!

Screenshot 2024-06-10 at 11 21 59

??

@wellerjes see the comment above. Can you help us figure out why this happens?

What I think happened - volunteer could not find "Davidson Brothers Marble Co." because it was not an AKA of the original "Davidson Brothers Marble Company". I was reviewing her work and updated the "Co." to Company, then added the AKA without realizing that there was already an agent named that. I'll remind our volunteers working on agents to try different searches before creating an agent.

I think this is what's happening with duplicate agents: if someone doesn't search with a % or searches "first name+last name" when the agent is only entered as "first name+middle name+last name" (with no AKAs) then they're not finding the correct agent. I've done this before. If the agent's name appears differently throughout the records (J. Weller vs. Jessica Weller vs. J. L. Weller could all be me) it's not always obvious to someone that the agent is the same person, which is why they might ignore the big red box that says "this might be a duplicate".

I'm just not sure how to change this behavior. If people are only going to search one thing and give up, this will keep happening.

Can we somehow make the search less strict and find near matches?

somehow make the search less strict and find near matches

There's a whole thread of me saying that would lead here and everyone insisting that they were getting too many matches somewhere....

There's a whole thread of me saying that would lead here and everyone insisting that they were getting too many matches somewhere....

Also fair because when there are too many, adds just get made. I don't think we can stop humans from being human, we can just keep asking everyone to try harder.

when there are too many, adds just get made

Yea, there's a whole 'nuther thread of me saying that low-quality data inspires low-quality data....

https://arctos.database.museum/agent.cfm?srch=Davidson%20Brothers%20Marble%20Co.&include_verbatim=false&include_bad_dup=true - "This is the search you're looking for." does not have the problem described, at least as I understand it. I don't know if that's a UI problem (something I might address) or a documentation/training problem (something The Community might address), or something else entirely.

Screenshot 2024-06-12 at 06 51 23

There is some relevant documentation regarding "J. Weller vs. Jessica Weller vs. J. L. Weller":

A generic search, such as only a last name is preferred. This form is searching Agent Preferred Names, so a search for John Smith will not return the agent John H. Smith, but a search for Smith will return both.

https://handbook.arctosdb.org/how_to/How-to-Search-Agents.html
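To make the handbook's example concrete, here is a small sketch contrasting two matching strategies. It assumes the described behavior is equivalent to a case-insensitive substring match on the preferred name; that is a guess based on the quoted text, not a description of how the Arctos search form is actually implemented. The `token_match` alternative is purely hypothetical.

```python
def substring_match(query, name):
    """True if the whole query appears as one contiguous piece of the name."""
    return query.casefold() in name.casefold()


def tokens(text):
    """Split a name into lowercase words, ignoring periods after initials."""
    return text.casefold().replace(".", " ").split()


def token_match(query, name):
    """True if every word of the query appears somewhere in the name."""
    name_tokens = tokens(name)
    return all(tok in name_tokens for tok in tokens(query))


names = ["John Smith", "John H. Smith"]

strict_full = [n for n in names if substring_match("John Smith", n)]
strict_last = [n for n in names if substring_match("Smith", n)]
loose_full = [n for n in names if token_match("John Smith", n)]

print(strict_full)  # ['John Smith']
print(strict_last)  # ['John Smith', 'John H. Smith']
print(loose_full)   # ['John Smith', 'John H. Smith']
```

A looser match like `token_match` finds the middle-initial variant, at the cost of returning more rows for common names, which is the trade-off discussed elsewhere in this thread.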

or a documentation/training problem (something The Community might address)

I think we have addressed it - the question is does anybody read or use documentation?

https://handbook.arctosdb.org/how_to/How-to-Create-Agents.html#before-creating-a-new-agent

does anybody read or use documentation?

That's the part we haven't addressed, training. Arctos is very hippy-commune-ish about how roles are handed out, maybe we've outgrown that. I'm not sure what exactly the alternative might be, but lots of things require some sort of training/testing/whatever and there must be thousands of models we could explore.

but a search for Smith will return both

A search for Smith will give you "CAUTION: Return limit exceeded, some data may be excluded. Please perform a more specific search to ensure accurate results."

I ran into this the other day when searching for my volunteer Judy Miller. Searching under Agent name for "Miller", I just about added her again, but was stopped when the agent creator found the agent I was looking for.

Yea, that's the other juggle-ball: I've got limited resources, I often don't have the capacity to send everything even when you might not get overwhelmed by it. Some of that's potentially fixable - eg do I really need to be including [all of whatever I'm currently including] in the 'anything' search, is there a better sort that might get "us" (unverified us - I'm already sorting by that) closer to the top, etc., etc.?

Duplicates:

 agent_id | agent_type | preferred_agent_name |         creator          |        created_date        
----------+------------+----------------------+--------------------------+----------------------------
 21346039 | person     | David C. Evans       | Joseph Hopkins           | 2022-10-03 08:42:01.420229
 21258378 | person     | David C. Evans       | unknown                  | 2013-12-16 21:49:31
 21334283 | person     | J. O. Sullivan       | Teresa J. Mayfield-Meyer | 2021-09-07 16:29:24.897153
     7604 | person     | J. O'Sullivan        | unknown                  | 2013-12-16 21:49:31
  1017329 | person     | LaRue                | unknown                  | 2013-12-16 21:49:31
 21253481 | person     | La Rue               | unknown                  | 2013-12-16 21:49:31
  1011480 | person     | L. VanHorn           | unknown                  | 2013-12-16 21:49:31
  1010287 | person     | L. Van Horn          | unknown                  | 2013-12-16 21:49:31
 21256873 | person     | Mary O'Donnel        | unknown                  | 2013-12-16 21:49:31
 21253957 | person     | Mary O’Donnel        | unknown                  | 2013-12-16 21:49:31
 21257004 | person     | Röner                | unknown                  | 2013-12-16 21:49:31
 21258795 | person     | Rößner               | unknown                  | 2013-12-16 21:49:31
 21352433 | person     | Tom Rickman          | Derek S. Sikes           | 2024-04-25 12:36:54.346303
 21352432 | person     | Tom Rickman          | Jozef A. Slowik          | 2024-04-25 12:09:25.013787
 21352431 | person     | Tom Rickman          | Jozef A. Slowik          | 2024-04-25 12:06:48.200331
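One way pairs like these could be surfaced is by comparing a normalized key rather than the raw string. The sketch below is an assumption about a possible approach, not the script that produced the table above; `name_key` and `group_candidates` are hypothetical helpers.

```python
import unicodedata
from collections import defaultdict


def name_key(name):
    """Reduce a name to lowercase letters and digits for loose comparison."""
    # NFKD splits accented letters into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # casefold() lowercases and maps "ß" to "ss"; spaces, periods,
    # apostrophes (straight or curly) and combining marks are dropped.
    return "".join(ch for ch in decomposed.casefold() if ch.isalnum())


def group_candidates(names):
    """Return groups of names that share the same normalized key."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


names = [
    "J. O. Sullivan", "J. O'Sullivan",
    "LaRue", "La Rue",
    "Mary O'Donnel", "Mary O\u2019Donnel",
    "Röner", "Rößner",
]
candidates = group_candidates(names)
for group in candidates:
    print(group)
```

With this key the Sullivan, La Rue and O'Donnel pairs are grouped, while "Röner" and "Rößner" are not (their keys are "roner" and "rossner"), so a pair like that would need a fuzzier comparison. Matching keys only flag candidates for a human to review; they do not show that two records are the same person.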

Information only in remarks:

 agent_id | preferred_agent_name |       creator       |        created_date        |                              attribute_value                               
----------+----------------------+---------------------+----------------------------+----------------------------------------------------------------------------
 21352952 | Izak Veals           | Paige Wilson Deibel | 2024-06-26 13:28:50.081707 | student at University of Washington, employee of Burke Museum
 21352951 | Christina Stuhl      | Paige Wilson Deibel | 2024-06-26 13:27:37.568765 | student at University of Washington, volunteer in Burke Museum Paleobotany
 21352950 | Ray Cagnetta         | Paige Wilson Deibel | 2024-06-26 13:25:58.536078 | employee of Burke Museum, museology student at University of Washington
 21352949 | Ana Gutierrez        | Paige Wilson Deibel | 2024-06-26 13:23:06.542398 | volunteer for Burke Museum Paleobotany
 21352948 | Amanda Godfrey       | Paige Wilson Deibel | 2024-06-26 13:22:10.115826 | paleobotany volunteer at Burke Museum
 21352947 | Elena Stiles         | Paige Wilson Deibel | 2024-06-26 12:44:56.034333 | paleobotanist, PhD student at University of Washington
 21352916 | Lulu Gaustad         | Angela Linn         | 2024-06-21 16:41:38.110722 | UAM Ethnology and History
 21352914 | Margen Burke Riley   | Angela Linn         | 2024-06-21 12:38:46.148758 | UAM Ethnology and History
 21352897 | David M. Evans       | Michelle S. Koo     | 2024-06-17 20:56:55.810041 | associated with University of Wyoming in 1970s
 21352897 | David M. Evans       | Michelle S. Koo     | 2024-06-17 20:56:55.810041 | UWYMV collector active in the 1970s
 21352864 | Judith Price         | Mariel L. Campbell  | 2024-06-07 16:36:57.565727 | CMN

Duplicates

As for

J. O. Sullivan
J. O'Sullivan

and don't forget

John O. Sullivan

It is hard for me to say if these are one person, two people, or three without more information from the collections.

It does feel like J. O'Sullivan is just a mis-transcription of John O. Sullivan but I have no definitive proof. The collecting locations differ for J. O. Sullivan and John O. Sullivan so again, I think I would need more information to decide if they are the same person. Maybe @mkoo can figure it out with whatever they have in the MVZ:Arch collection?

Re: Tom Rickman - here's what Slowik emailed me back on Apr 25: "Additionally, when I try to enter the collector, Tom Rickman, I tried to create the person as before but it errors out no matter what."

and "Tom is alive. It's really quirky. If I try to add any additional info on him then it just errors out. If I don't enter any of the fields I get the option to force create and then it errors out."

and I replied: "Ok, I made an agent record for Tom Rickman. I also was presented with some arctos weirdness and asked to force create which I did and it worked. Arctos agents is being re-tooled so there's all sorts of buggy behavior I hope they iron out fast!"

And then:
"Well I got the boxes to turn green but the errors still exist anywhere I put Tom's name. Ideas?

2024-4-25T10:45:30: FAIL: agent_1_name [ Tom Rickman ] is invalid; record_event_determiner [ Tom Rickman ] matches 0 agents; locality_attribute_1_determiner [ Tom Rickman ] matches 0 agents: {"message":"agent_1_name [ Tom Rickman ] is invalid; record_event_determiner [ Tom Rickman ] matches 0 agents; locality_attribute_1_determiner [ Tom Rickman ] matches 0 agents","status":"fail"}

So not user error. Just users trying to get Arctos to behave!

So not user error. Just users trying to get Arctos to behave!

That error exists because there is more than one Tom Rickman and Arctos doesn't know which one to choose.

That might explain the later error, but not the earlier one, which happened before any agent had been made (during the process of trying to make the first one).

Information only in remarks:

Screenshot 2024-06-27 at 6 05 24 PM

Can someone explain to me what this comment is about in terms of the issue of "agent guardrails" - I just added some additional information in both of those records but even prior to that they both had relationships with other established agents.

explain

Bad timing, I was on the wrong server, I fired off the wrong script, my script is broken (please let me know if so)..... who knows, if the good stuff isn't solely in remarks then yay everybody.

And my primary purpose here is still #7649 (comment), I'm just gathering some examples, seeing what might be possible, what The Community would like (and if we can figure out how to do that), etc. - if I'm questioning something that you think is OK, PLEASE let me know that too.

Why? #7894, immediately. There's some data in there that I think possibly shouldn't be loaded (but HOW?), maybe it's fine, maybe my standards are weird, maybe I'm not being paranoid enough, who knows, none of those are decisions that any of us want to make alone, HELP! (And it's all complicated by a bunch of us simultaneously experiencing personal issues, we're not ignoring you @javanveldhuizen!)

Here's fresh data with something in remarks, no relationships, no events, created in the last month.

 agent_id | preferred_agent_name |       creator       |        created_date        |                              attribute_value                               
----------+----------------------+---------------------+----------------------------+----------------------------------------------------------------------------
 21352952 | Izak Veals           | Paige Wilson Deibel | 2024-06-26 13:28:50.081707 | student at University of Washington, employee of Burke Museum
 21352951 | Christina Stuhl      | Paige Wilson Deibel | 2024-06-26 13:27:37.568765 | student at University of Washington, volunteer in Burke Museum Paleobotany
 21352950 | Ray Cagnetta         | Paige Wilson Deibel | 2024-06-26 13:25:58.536078 | employee of Burke Museum, museology student at University of Washington
 21352949 | Ana Gutierrez        | Paige Wilson Deibel | 2024-06-26 13:23:06.542398 | volunteer for Burke Museum Paleobotany
 21352948 | Amanda Godfrey       | Paige Wilson Deibel | 2024-06-26 13:22:10.115826 | paleobotany volunteer at Burke Museum
 21352947 | Elena Stiles         | Paige Wilson Deibel | 2024-06-26 12:44:56.034333 | paleobotanist, PhD student at University of Washington
 21352897 | David M. Evans       | Michelle S. Koo     | 2024-06-17 20:56:55.810041 | associated with University of Wyoming in 1970s
 21352897 | David M. Evans       | Michelle S. Koo     | 2024-06-17 20:56:55.810041 | UWYMV collector active in the 1970s
 21352864 | Judith Price         | Mariel L. Campbell  | 2024-06-07 16:36:57.565727 | CMN
(9 rows)
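For anyone wanting to check the criteria, the filter described above (something in remarks, no relationships, no events, created in the last month) could be sketched as below. The record structure and field names are hypothetical stand-ins, the sample agents are placeholders rather than real records, and this is not the script that produced the table.

```python
from datetime import datetime, timedelta


def remarks_only(agents, now, days=30):
    """Return recently created agents whose only information is in remarks."""
    cutoff = now - timedelta(days=days)
    return [
        a for a in agents
        if a["remarks"]               # has at least one remark
        and not a["relationships"]    # no relationships to other agents
        and not a["events"]           # no events
        and a["created"] >= cutoff    # created within the window
    ]


agents = [
    {"name": "Agent A", "created": datetime(2024, 6, 26),
     "remarks": ["volunteer"], "relationships": [], "events": []},
    {"name": "Agent B", "created": datetime(2024, 6, 26),
     "remarks": ["student"], "relationships": ["employee of"], "events": []},
    {"name": "Agent C", "created": datetime(2023, 1, 1),
     "remarks": ["note"], "relationships": [], "events": []},
]

flagged = remarks_only(agents, now=datetime(2024, 6, 27))
print([a["name"] for a in flagged])  # ['Agent A']
```

Under these criteria an agent with any relationship at all is excluded, so the result depends on when the check is run relative to when the relationships were added.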

@dustymc script still not working?

See #7649 (comment)

Izak Veals definitely has stuff other than remarks.

working

I'm trying to figure out what that means! I added relationships, data below. I think I was trying to avoid derived data, the take-home (if The Community wants to consider this in any way) is that I probably can't exclude low-value relationships. I can't see much way to separate "they actually hang around here, we know this person" and "a vaguely similar name is scribbled on something that once passed through here for some reason." (So #7649 (comment) still looks worth investigation.)

 agent_id | preferred_agent_name |      creator       |        created_date        |                attribute_value                 
----------+----------------------+--------------------+----------------------------+------------------------------------------------
 21352897 | David M. Evans       | Michelle S. Koo    | 2024-06-17 20:56:55.810041 | UWYMV collector active in the 1970s
 21352897 | David M. Evans       | Michelle S. Koo    | 2024-06-17 20:56:55.810041 | associated with University of Wyoming in 1970s
 21352864 | Judith Price         | Mariel L. Campbell | 2024-06-07 16:36:57.565727 | CMN
(3 rows)

This is not producing the desired results, I'm killing it.