bernard
BuSo Pro
Suppose you had 1,000 rows of real estate data, each row a project with aggregate data (features, price ranges, square footage, etc.) containing errors, specifically miscategorized bedroom counts, so the aggregates for studios or 1-bedrooms, like the price ranges or size ranges, come out wrong.
The data has been collected by scraping other real estate sites, who in turn get it from agents, who are the ones making the "mistakes", likely to get more views when people sort by lowest price and the like.
Would you attempt to clean this up algorithmically with some kind of data engineering voodoo, or would you clean it up manually, perhaps by building a backend for it (with Retool, for example) and having some virtual assistants do the work?
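To illustrate what I mean by "algorithmically": a rough rule-based pass like the sketch below (pandas, with column names I've made up for the example) wouldn't fix anything on its own, but it could shortlist the suspect rows so the VAs only review flagged projects instead of all 1,000.

```python
import pandas as pd

# Hypothetical column names for illustration only; adjust to the real schema.
# Each row = one project, with aggregate min/max price and size per unit type.
PLAUSIBLE_SQFT = {"studio": (200, 800), "1br": (350, 1200)}

def flag_suspect_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Return rows whose studio/1BR aggregates look miscategorized."""
    flags = pd.Series(False, index=df.index)

    for unit, (lo, hi) in PLAUSIBLE_SQFT.items():
        size_min = df[f"{unit}_sqft_min"]
        size_max = df[f"{unit}_sqft_max"]
        price_min = df[f"{unit}_price_min"]

        # Rule 1: size range falls outside plausible bounds for the unit type.
        flags |= (size_min < lo) | (size_max > hi)

        # Rule 2: minimum price per sq ft far below the project-wide median,
        # a common symptom of a larger unit relabeled to rank cheap in sorts.
        ppsf = price_min / size_min
        flags |= ppsf < 0.5 * df["project_median_price_per_sqft"]

    return df[flags]

# suspects = flag_suspect_rows(projects_df)
# suspects.to_csv("needs_manual_review.csv", index=False)  # hand off to VAs
```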
Since it's aggregate data and project features, it's unlikely to change much for a while, so I'm considering whether yearly manual updates would be worth it. No one else seems to be doing this, so I wonder if they just don't care; either way, I need accurate data for my purposes.